

A Survey on Software Defect Prediction: Supporting Tables

Emad Shihab and Ahmed E. Hassan

1. DATA SOURCES AND GRANULARITY

Table I shows the data sources and granularity used in the surveyed SDP papers. Each row of the table represents one of the surveyed papers. An ‘o’ means the criterion is applied by the paper, a ‘.’ means the criterion was not applied by the paper and a ‘?’ means we could not determine whether or not the criterion was applied. An ‘NA’ means the criterion is not applicable for that paper. We had to use our understanding and judgement to determine the difference between when a paper did not apply a criterion (i.e., ‘.’) and when we could not determine whether a criterion was used (i.e., ‘?’). To illustrate the difference, we provide the following example. If a paper mentions that one project is used and it is a commercial project, then we would determine that no open source project is used (and mark the open source column as ‘.’ for that project). However, if the programming language of the project is not mentioned, then we would mark the programming language column as could not determine (‘?’), since we know that every project must be written in some programming language, but we could not determine what it was.

Emad Shihab (corresponding author) is with the Department of Software Engineering at the Rochester Institute of Technology, Rochester, NY, USA. Ahmed E. Hassan is with the School of Computing at Queen’s University, Kingston, ON, Canada.
Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.
© 2012 ACM 1529-3785/2012/0700-0001 $5.00


Table I: Data Sources and Granularities Used in SDP Studies. An ‘o’ means applied, a ‘.’ means not applied and ‘?’ means could not determine.

Columns: Paper | Repository: Source Code, Bug, Other | Project: Open Source, Commercial, Project Name, # Releases, Prog. Language | Granularity: Subsystem, File, Function, Other

Yuan; 2000 [2000] o o . . o Telecom ? ? High level o . . .Cartwright; 2000 [2000] o o . . o Telecom 1 C++ o . . .

Neufelder; 2000 [2000] o o . . o17 commercial orga-nizations

?C++; 2 un-kown

? ? ? .

Khoshgoftaar; 2000 [2000] o o . . o Telecom 4 Protel o . . .Khoshgoftaar; 2000 [2000] o o . . o Telecom 4 Protel o . . .

Morasca; 2000 [2000] o o . . o DATATRIEVE 2 BLISS o . . .

Wong; 2000 [2000] o o . . oTelecordia; client-server system

2C with embed-ded SQL

. o o .

Fenton; 2000 [2000] o o . . o Telecom 2 ? o . . .

Graves; 2000 [2000] o . . . oTelephone switchingsystem

1 C o . . .

Khoshgoftaar; 2001 [2001] o o . . o ? ? C++ . o . .

El Emam; 2001 [2001] ? o . . oCommercial wordprocessor

2 Java . o . .

Denaro; 2002 [2002] o o . o . Apache Web 2 C . o . .

Briand; 2002 [2002] o o . . oOracle Xpose andJwriter

? Java . o . .

Khoshgoftaar;2002 [2002] o o . . o Telecom 2 Protel o . . .Quah; 2003 [2003] . o Code . o QUES ? ? . o . .

Khoshgoftaar; 2003 [2003] o o . . o Telecom 4 Protel o . . .Guo; 2003 [2003] ? ? Nasa data . o Nasa ? C o . . .

Amasaki; 2003 [2003] o o ? . o Retail system ? ? . . . projectSucci; 2003 [2003] o . . . o ? ? C++ . o . .Guo; 2004 [2004] ? ? Nasa data . o Nasa ? C; C++ o . . .

Li; 2004 [2004] ? o ? o oIBM OS and mid-dleware; OpenBSD;Tomcat

22 ? ? ? ? .

Ostrand; 2004 [2004] o . . . oAT&T inventory sys-tem

17 Java . o . .

Koru;2005 [2005] ? ? Nasa data . o Nasa ? C; C++ . o . .Gyimothy; 2005 [2005] o o . o . Mozilla 7 C++ . o . .

Mockus; 2005 [2005] o oCustomerinfo

. o Telecom ? C; C++ o . . .

Nagappan; 2005 [2005a] o o . . oWindows Server2003

? C; C++ o . . .

Hassan; 2005 [2005] o o . o .NetBSD; FreeBSD;OpenBSD; KDE;Koffice; Postgres

? C; C++ o . . .

Nagappan; 2005 [2005b] o o . . oWindows Server2003

? C; C++ o . . .

Tomaszewski;2006 [2006] ? ? ? . o Telecom 3?; OO lan-guage

o o . .

Pan; 2006 [2006] o . . o .Apache HTTP andLatex2rtf

? C; C++ . o o .


Nagappan; 2006 [2006] o o . . o

IE6; IIS W3 Servercore; Process Mes-saging Component;DirectX; NetMeeting

? C#; C++ o . . .

Zhou; 2006 [2006] ? ? ? . o Nasa ? C; C++ . o . .

Li; 2006 [2006] o o . . oABB monitoring sys-tem and controllermanagement

13(MS)and15(CM)

? C++ o . . .

Knab; 2006 [2006] o o . o . Mozilla web browser 7 C; C++ . o . .Arisholm; 2006 [2006] o o . . o Telecom ? Java . o . .

Tomaszewski;2007 [2007] ? ? ? . o Telecom 3?; OO lan-guage

o o . .

Ma; 2007 [2007] ? ? ? . o Nasa ? C; C++ o . . .Menzies; 2007 [2007] ? ? ? . o Nasa ? C; Java o . . .Olague; 2007 [2007] o o . o . Mozilla Rhino 6 Java . o . .

Bernstein; 2007 [2007] o o . o . Eclipse vary Java . o . .

Aversano; 2007 [2007] o . . o .JHotDraw and DNS-Java

? Java . o . .

Kim; 2007 [2007] o ? . o .

Apache HTTP;Subversion; Post-greSQL; Mozilla;Jedit; Columba;Eclipse

? C; C++; Java . o o .

Ratzinger; 2007 [2007] o o . o oHealth care; Ar-goUML and Spring(OSS)

? Java . o . .

Mizuno; 2007 [2007] o o . o .ArgoUML andEclipse BIRT

? Java . o . .

Weyuker; 2007 [2007] o o . . oAT&T inventory;provisioning andvoice response

17inv;9pro-vi-sion-ing

? . o . .

Zimmermann; 2007 [2007] o o . o . Eclipse 3 Java o o . .Moser; 2008 [2008] o o . o . Eclipse 3 Java . o . .Kamei; 2008 [2008] o o . o . Eclipse 2 Java ? ? ? .

Zimmermann; 2008 [2007] o o . . oWindows Server2003

? ? o . . .

Zimmermann; 2008 [2008] o o . . oWindows Server2003

? ? o . . .

Nagappan; 2008 [2008] o o . . o Windows Vista ? ? o . . .Zhang; 2008 [2008] . o . o . Eclipse ? Java o . . .Pinzger; 2008 [2008] o o . . o Windows Vista ? ? o . . .

Lessmann; 2008 [2008] ? ? ? . o Nasa ? ? o . . .Watanabe; 2008 [2008] o . . o . jEdit and Sakura 6 C++; Java . o . .


Kim; 2008 [2008] o o . o .

Apache; Bugzilla;Columba; Gaim;Gforge; Jedit;Mozilla; EclipseJDT; Plone; Post-greSQL; Scarab;Subversion

?

C; C++; Java;Perl; Python;JavaScript;PHP andXML

. . o .

Jiang; 2008 [2008] ? ? ? . o Nasa ?C; C++; Java;Perl

o . . .

Weyuker; 2008 [2008] o o . . oBusiness mainte-nance system

61 ? . o . .

Jiang; 2008 [2008] ? ? ? . o Nasa ?C; C++; Java;Perl

o . . .

Tosun; 2008 [2008] ? ? ? . o Nasa ? ? o . . .

Koru; 2008 [2008] o . . o oKoffice; ACE; IBM-DB

? C++ . o . .

Menzies; 2008 [2008] ? ? ? . o Nasa ? C; C++; Java o . . .Layman; 2008 [2008] o o . . o ? ? C#; C++ o . . .Goronda; 2008 [2008] ? ? ? . o Nasa ? C o . . .

Vandecruys; 2008 [2008] ? ? ? . o Nasa ? ? o . . .

Elish; 2008 [2008] ? ? ? . o Nasa ? C; C++; Java o . . .

Ratzinger; 2008 [2008] o ? . o .ArgoUML; JbossCache; Liferay Por-tal; Spring; Xdoclet

? Java . o . .

Meneely; 2008 [2008] ?o ?o . . oNortel networkingsystem

3 ? . o . .

Wu; 2008 [2008] o o . . o SoftPM 4 Java o . . .Tarvo; 2008 [2008] o o . . o Microsoft ? ? . . . Changes

Holschuh; 2009 [2009] o o . . o SAP

6projects;3re-leaseseach

Java o o . .

Jia; 2009 [2009] o o . o o Eclipse and QMP 6 C; Java o o . .

Shin; 2009 [2009] o o . . oBusiness mainte-nance system

35 C; C++ . o . .

Mende; 2009 [2009] o o . . oAlcatel-LucentLambdaUnite

? C;C++ . o . .

Binkley; 2009 [2009] o o . o oMozilla web browserand MP

1 C; C++; Java o . . .

D’Ambros; 2009 [2009] o o o o .ArgoUML; JDTCore; Mylyn

? Java . o . .

Hassan; 2009 [2009] o o . o .NetBSD; FreeBSD;OpenBSD; KDE;Koffice; Postgres

? C; C++ o . . .


Bird; 2009 [2009] o o . o o Vista and Eclipse

1Vista;6Eclipse

C; C++; C#;Java

o . . .

Ferzund; 2009 [2009] o o . o .

Apache; Epiphany;Evolution; Nautilus;PostgreSQL; Eclipse;Mozilla

? C; C++; Java . . . Hunks

Liu; 2010 [2010] ? ? ? . o Nasa ? C; C++; Java o . . .

Erika; 2010 [2010] ? ? ? ? ?ECS; BNS; CRS stu-dent projects

? Java . o . .

Zhou; 2010 [2010] o o . o . Eclipse 3 Java . o . .

Mockus; 2010 [2010] o o . . oAvaya switching sys-tem

? C; C++ . o . .

Kamei; 2010 [2010] o o . o .Eclipse Platform;JDT; PDE

9 Java o o . .

Meneely; 2010 [2010] o oVulnerabilityDB

o .Linux Kernal; PHP;Wireshark

? C; C++ . o . .

Nguyen; 2010 [2010] o o . o . Eclipse 3 Java o o . .Shihab; 2010 [2010] o o . o . Eclipse 3 Java . o . .

Menzies; 2010 [2010] ? ? ? . oNASA and Turkishcommercial systems

? C; C++; Java o . . .

Weyuker; 2010 [2010] o o . . oAT&T inventory;provisioning andvoice response

? ? . o . .

Mende; 2010 [2010] o o . o o NASA and Eclipse3forEclipse

? Java ?o . . .

Nugroho; 2010 [2010] o o . . o Healthcare system ? Java . o . .Song; 2011 [2011] ? ? ? . o Nasa . ? o . . .

Nguyen; 2011 [2011] o o . o . Eclipse JDT 1 Java . o . .

Mende; 2011 [2011] o o . . o Avionics and Nasa ?C; C++; Java;Perl

. o o .

Lee; 2011 [2011] o oMylyndata

o . Eclipse Bugzilla ? ? . o . Tasks

Shihab; 2011 [2011] o o . . oAvaya telephonyproject

5 C; C++ . o . .

D’Ambros; 2011 [, 2010] o o . o .

Eclipse JDT; PDE;Equinox framework;Mylyn; ApacheLucene

5 Java o o . .

Kpodjedo; 2011 [2011] o o . o .Rhino; ArgoUML;Eclipse

7Rhino;9Ar-goUML;3Eclipse

Java . o . .

Bird; 2011 [2011] o o . . o Windows Vista and 7 ? ? o . . .Meneely; 2011 [2011] o o . . o Cisco . C; C++; Java . o . .


Giger; 2011 [2011] o o . o . Eclipse ? Java . o . .

Percentage of papers 75 69 - 37 69 18 Nasa; 19 Eclipse - 43 Java; 46 C;C++ 49 51 5 -


2. METRICS

Table II shows the dependent and independent variables used in prior SDP studies. In terms of independent variables, we observe that 76% of studies use product metrics and 45% use process metrics. The large number of studies using product metrics may be explained by the large number of studies that used OO metrics to predict defects in the early 2000s. On the other hand, recent studies seem to be trending towards using process metrics, especially since recent studies showed that process metrics are as good as, or better, predictors than product metrics (Moser et al. [2008]).
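To make the split between product and process metrics concrete, the following minimal sketch (in Python) builds a toy feature table in the spirit of Table II; the file names, metric names and values are illustrative assumptions, not data taken from any surveyed study.

```python
# Illustrative only: product metrics describe the code itself, process metrics
# describe how the code was changed, and the dependent variable is the
# post-release defect count. All names and values below are made up.
import pandas as pd

files = pd.DataFrame({
    "file":          ["a.java", "b.java", "c.java"],
    # product metrics
    "loc":           [120, 850, 300],
    "cyclomatic":    [8, 42, 15],
    "cbo":           [3, 11, 5],
    # process metrics
    "churn":         [40, 600, 90],
    "prior_changes": [2, 25, 7],
    # dependent variable (post-release defects)
    "post_defects":  [0, 4, 1],
})

X = files[["loc", "cyclomatic", "cbo", "churn", "prior_changes"]]  # independent variables
y = (files["post_defects"] > 0).astype(int)                        # defect-prone (1) or not (0)
print(X)
print(y.tolist())
```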


Table II: Metrics Used in SDP Studies. An ‘o’ means applied, a ‘.’ means not applied and ‘?’ means could not determine.

Columns: Paper | Independent Variables: Product, Process, # of Metrics, Other | Dependent Variables: Pre, Post, Other

Yuan; 2000 [2000] o o 10 . . o .Cartwright; 2000 [2000] o . 12 . . o .Neufelder; 2000 [2000] . o 14 . . . Defect density

Khoshgoftaar; 2000 [2000] o o 42 Execution . o .Khoshgoftaar; 2000 [2000] o o 42 Execution . o .

Morasca; 2000 [2000] o o 8 ModuleKnowlege . o .Wong; 2000 [2000] . . 5 Design . o .Fenton; 2000 [2000] o . 3 Design o o .Graves; 2000 [2000] o o 9 . . o .

Khoshgoftaar; 2001 [2001] o o 5 . . o .

El Emam; 2001 [2001] . . 26 OO design . o .Denaro; 2002 [2002] o . 38 . . o .Briand; 2002 [2002] o . 22 Polymorphism . o .

Khoshgoftaar;2002 [2002] o o 42 Execution . o .Quah; 2003 [2003] o . 14 OO desgin o o .

Khoshgoftaar; 2003 [2003] o o 42 Execution . o .Guo; 2003 [2003] o . 21 . . o .

Amasaki; 2003 [2003] o o 17 Effort; test items . .Faults in devel-opment phase

Succi; 2003 [2003] o . 7 . o . .Guo; 2004 [2004] o . 21 . . o .

Li; 2004 [2004] ? ? ? ? . o .Ostrand; 2004 [2004] o o 6 Prog. Language . . Pre+Post

Koru;2005 [2005] o . 31 . . o .Gyimothy; 2005 [2005] o . 8 . . . Pre+Post

Mockus; 2005 [2005] . . 8Deployment; us-age; platform; HWconfiguration

. o .

Nagappan; 2005 [2005a] . . ? 2PREfix and PRE-fast

. .o (pre-releasedefect density)

Hassan; 2005 [2005] . o 4 . o . .

Nagappan; 2005 [2005b] . . 8 Relative churn . .o (pre-releasedefect density)

Tomaszewski;2006 [2006] . o 3 . . o Fault densityPan; 2006 [2006] o . 31 Program slicing o . .

Nagappan; 2006 [2006] o . 18 . . o .Zhou; 2006 [2006] o . 6 Design . o .

Li; 2006 [2006] o o 47Deployment; us-age; platform; HWconfiguration

. o .

Knab; 2006 [2006] o o 16 . . . Defect density

Arisholm; 2006 [2006] o o 32Requirementchanges

. . Pre+Post

Tomaszewski;2007 [2007] o . 9 . . o Fault densityMa; 2007 [2007] o . 6 . . o .

Menzies; 2007 [2007] o . 38 . . oOlague; 2007 [2007] o . 18 . ? ? ?

Bernstein; 2007 [2007] o o 22 Temporal . o .


Aversano; 2007 [2007] . . NAWeighted termvector

. .Bug introducingchange

Kim; 2007 [2007] . o 4Least recentlyused (LRU); LRUchange; LRU bug

o . .

Ratzinger; 2007 [2007] o o 17 Evolution ? ? Time basedMizuno; 2007 [2007] . . NA Code . . Pre+PostWeyuker; 2007 [2007] o o 8 Developer . o .

Zimmermann; 2007 [2007] o o 73 . . o .Moser; 2008 [2008] o o 49 . . o .Kamei; 2008 [2008] o . 15 . ? ? .

Zimmermann; 2008 [2007] . . 22 Dependency . o .Zimmermann; 2008 [2008] o . 46 . . o .

Nagappan; 2008 [2008] o o 28Churn; dependen-cies; code cover-age

. o .

Zhang; 2008 [2008] . . 1 # of defects . . Pre+Post

Pinzger; 2008 [2008] . o 7Developer net-works

. o .

Lessmann; 2008 [2008] o .13-37

. . o .

Watanabe; 2008 [2008] o . 63 . ? ? ?

Kim; 2008 [2008] o . 63+Terms fromchanges

. .Bug introducingchange

Jiang; 2008 [2008] o . 40 Design . o .Weyuker; 2008 [2008] ? ? ? ? . . Pre+Post

Jiang; 2008 [2008] ? ?21-40

. . o .

Tosun; 2008 [2008] o . ? . . o .

Koru; 2008 [2008] o . . . . . Pre+PostMenzies; 2008 [2008] o . ? . . o .Layman; 2008 [2008] o o 159 . . . Pre+PostGoronda; 2008 [2008] o . 21 . . o .

Vandecruys; 2008 [2008] o . 23 . . o .

Elish; 2008 [2008] o . 21 . . o .

Ratzinger; 2008 [2008] . . 110Refactoring andnon-refactoringfeatures

. .Pre+Post (time-based)

Meneely; 2008 [2008] o o 13 . . o .Wu; 2008 [2008] o . 6 . . o .

Tarvo; 2008 [2008] o o 29+ Dependency . . regression

Holschuh; 2009 [2009] o o 78Dependency; Codesmell

. o .

Jia; 2009 [2009] o o30-40

. . o .

Shin; 2009 [2009] o o 22 Calling structure . o .Mende; 2009 [2009] o o ? . . o .Binkley; 2009 [2009] o . 3 Natural language . o .

D’Ambros; 2009 [2009] o o 16 Change coupling . . Pre+Post


Hassan; 2009 [2009] . o 5Change complex-ity

. .Pre+Post (time-based)

Bird; 2009 [2009] . o 24 Socio-technical . o .

Ferzund; 2009 [2009] . . 27 Hunk metrics . .Bug introducinghunk

Liu; 2010 [2010] o . 15 . . o .

Erika; 2010 [2010] o . ? UML ? ? ?Zhou; 2010 [2010] o o 10 . . o .

Mockus; 2010 [2010] o o 13 Geography . o .Kamei; 2010 [2010] o o 22 . . o .

Meneely; 2010 [2010] . o 4 Developer . . Vulnerabilities

Nguyen; 2010 [2010] o . 33 Dependency . o .Shihab; 2010 [2010] o o 34 . . o .

Menzies; 2010 [2010] o .21-39

. . o .

Weyuker; 2010 [2010] o o 7 Prog. Language . . Pre+PostMende; 2010 [2010] o ? ? . . o .

Nugroho; 2010 [2010] o . 3 UML . o .

Song; 2011 [2011] o .21-40

. . o .

Nguyen; 2011 [2011] o o ? Topic models . o .Mende; 2011 [2011] o . 17 . . . Pre+Post

Lee; 2011 [2011] o o 81 . . .Post (timebased)

Shihab; 2011 [2011] o o 16 Co-change; time . oBreakages andSurprise

D’Ambros; 2011 [, 2010] o o 44

entropy ofchanges; churnof source code andentropy of sourcecode

. o .

Kpodjedo; 2011 [2011] o .11-32

Design evolution . o .

Bird; 2011 [2011] o . 7 Ownership o o .Meneely; 2011 [2011] . . 4 Team expansion . . Failures per hour

Giger; 2011 [2011] . o 2Fine-grained codechanges

. . Pre+Post

Percentage of papers 76 45 - - 7 65 -


3. MODELS

Table III shows the models used in prior SDP studies. In terms of tree-based models, we find that 26% of studies use decision trees, 15% use random forests, 3% use recursive partitioning and 2% use CART models.
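For readers less familiar with these model families, the short sketch below fits one representative of each (logistic regression, naive Bayes, a decision tree and a random forest) on a synthetic dataset using scikit-learn; the data and hyperparameters are assumptions made purely for illustration and do not reproduce the setup of any surveyed paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced binary data standing in for defect-prone vs. clean modules.
X, y = make_classification(n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "naive Bayes": GaussianNB(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.2f}")
```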


Table III: Types of Models Used in SDP Studies. An ‘o’ means applied, a ‘.’ means not applied and ‘?’ means could not determine.

Columns: Paper | Statistical: Naive Bayes, MARS, Linear Regression, Logistic Regression | Decision Tree-based: Decision Trees, CART, Random Forests, Recursive Partitioning | SVM | Other

Yuan; 2000 [2000] . . . . . . . . . Fuzzy subtractive clusteringCartwright; 2000 [2000] . . o . . . . . . .Neufelder; 2000 [2000] . . . . . . . . . .

Khoshgoftaar; 2000 [2000] . . . . o . . . . .Khoshgoftaar; 2000 [2000] . . . . . o . . . .

Morasca; 2000 [2000] . . . o . . . . . Rough SetsWong; 2000 [2000] . . . . . . . . . ?Fenton; 2000 [2000] . . . . . . . . . .Graves; 2000 [2000] . . . o . . . . . GLM

Khoshgoftaar; 2001 [2001] . . o . . . .. . . Zero-Inflated Poission

El Emam; 2001 [2001] . . . o . . . . . .Denaro; 2002 [2002] . . . o . . . . . .Briand; 2002 [2002] . o . o . . . . . .

Khoshgoftaar;2002 [2002] . . . . o . . . . .Quah; 2003 [2003] . . . . . . . . . Neural Networks

Khoshgoftaar; 2003 [2003] . . o . o o . . .CART-LS; CART-LAD; S-PLUS;MLR; ANN and CBR

Guo; 2003 [2003] . . . o . . . . .Dempster Shafer belief networks;Discriminant analysis

Amasaki; 2003 [2003] . . . . . . . . . BBN

Succi; 2003 [2003] . . o . . . . . .NBR; zero-inflated NBR andPoisson regression

Guo; 2004 [2004] . . . o o . o . . Discriminant analysis

Li; 2004 [2004] . . . . . . . . .Exponential; Gamma; Power;Logarithmic and Weibull

Ostrand; 2004 [2004] . . . . . . . . . .Koru;2005 [2005] . . . . o . . . . J48 and Kstar

Gyimothy; 2005 [2005] . . o o o . . . . Neural NetworksMockus; 2005 [2005] . . . o . . . . . .

Nagappan; 2005 [2005a] . . o . . . . . . Discriminant AnalysisHassan; 2005 [2005] . . . . . . . . . Ranking

Nagappan; 2005 [2005b] . . o . . . . . . Discriminant AnalysisTomaszewski;2006 [2006] . . . . . . . . . Random vs. best model

Pan; 2006 [2006] . . . . . . . . . Bayesian Network

Nagappan; 2006 [2006] . . . o . . . . . .Zhou; 2006 [2006] o . . o . . o . . Nnge

Li; 2006 [2006] . . o o . . . . . .Knab; 2006 [2006] . . . . o . . . . .

Arisholm; 2006 [2006] . . . o . . . . . .

Tomaszewski;2007 [2007] . . . . . . . . . Random vs. best modelMa; 2007 [2007] o . . o o . . . . IB1 and Bagging

Menzies; 2007 [2007] o . . . o . . . . OneROlague; 2007 [2007] . . . o . . . . . .

Bernstein; 2007 [2007] . . o . o . . . . .

Aversano; 2007 [2007] . . . o o . . . o Multi-boosting and KNN


Kim; 2007 [2007] . . . . . . . . . .Ratzinger; 2007 [2007] . . o . . . . . . Genetic programmingMizuno; 2007 [2007] . . . . . . . . . Spam filterWeyuker; 2007 [2007] . . . o . . . . . NBR

Zimmermann; 2007 [2007] . . o o . . . . . .Moser; 2008 [2008] o . . o o . . . . .Kamei; 2008 [2008] . . . o o . . . . Linear discriminant

Zimmermann; 2008 [2007] . . o o . . . . . .Zimmermann; 2008 [2008] . . o o . . . . . .

Nagappan; 2008 [2008] . . . o . . . . . .Zhang; 2008 [2008] . . . . . . . . . PolynomialPinzger; 2008 [2008] . . o o . . . . . .

Lessmann; 2008 [2008] o . . o o o o . o

LDA; QDA; BayesNet; LARS;RVM; K-NN; K*; MLP; RBF net;L-SVM; LS-SVM; LP; VP; ADT;LMT

Watanabe; 2008 [2008] . . . . o . . . . .

Kim; 2008 [2008] . . . . . . . . o .Jiang; 2008 [2008] o . . o . . o . . Bagging; Boosting

Weyuker; 2008 [2008] . . . . . . . o . NBRJiang; 2008 [2008] o . . o . . o . . Boosting; bagging

Tosun; 2008 [2008] . . . . . . . . .Ensemble of naive bayes; neuralnetworks and voting feature inter-val

Koru; 2008 [2008] . . . . . . . . . CoxMenzies; 2008 [2008] o . . . o . . . . .Layman; 2008 [2008] . . . o . . . . . .Goronda; 2008 [2008] . . . . . . . . o ANN

Vandecruys; 2008 [2008] . . . o o . . . o RIPPER; 1-NN; majority vote

Elish; 2008 [2008] o . . o o . o . oKNN; Multi-layer perceptrons;radial basis function; BBN;

Ratzinger; 2008 [2008] . . . . o . . . .LMT; Repeated IncrementalPruning; Nnge

Meneely; 2008 [2008] . . o o . . . . . .Wu; 2008 [2008] . . . . . . . . . ?

Tarvo; 2008 [2008] . . . o . o . . . Multilayer perceptron

Holschuh; 2009 [2009] . . o . . . . . oJia; 2009 [2009] o . . . o . o . . IB1

Shin; 2009 [2009] . . . o . . . . . NBRMende; 2009 [2009] . . . . . . o . . .Binkley; 2009 [2009] . . . . . . . . . Linear mixed-effects

D’Ambros; 2009 [2009] . . o . . . . . . .Hassan; 2009 [2009] . . o . . . . . . .

Bird; 2009 [2009] . . . o . . . . . .Ferzund; 2009 [2009] . . . o . . o . . .


Liu; 2010 [2010] o . . o o . o . .

Jrip; DecisionTable; OneR;PART; Ibk; IB1; ADTree; Ridor;LWLStump; SM; Bagging; LOC;TD

Erika; 2010 [2010] ? . . o ? . . . ? ?

Zhou; 2010 [2010] o . . o . . . . .Neural Networks; Kstar; Adtree;No learner

Mockus; 2010 [2010] . . . o . . . . . .Kamei; 2010 [2010] . . . . . . o o . MASS

Meneely; 2010 [2010] . . . . . . . . . Bayesian Network

Nguyen; 2010 [2010] . . . o . . . . . .Shihab; 2010 [2010] . . . o . . . . . .

Menzies; 2010 [2010] o . . . o . . . . RIPPER

Weyuker; 2010 [2010] . . . . . . o o .Bayesian additive regressiontrees; NBR

Mende; 2010 [2010] . . . . . . o . . .

Nugroho; 2010 [2010] . . . o . . . . . .Song; 2011 [2011] o . . . o . . . . OneR

Nguyen; 2011 [2011] . . o . . . . . . .Mende; 2011 [2011] . . . . . . o . . .

Lee; 2011 [2011] . . . o o . . . . Bayesian Network

Shihab; 2011 [2011] . . . o . . . . . .D’Ambros; 2011 [, 2010] o . . o o . . . . .Kpodjedo; 2011 [2011] . . o o . . . . . .

Bird; 2011 [2011] . . o . . . . . . .Meneely; 2011 [2011] . . o . . . . . . .

Giger; 2011 [2011] o . . o o . o . o Exhaustive CHAID; Neural Nets

Percentage of papers 16 1 22 47 26 2 15 3 8 -


4. PERFORMANCE EVALUATION

Table IV shows the performance evaluation methods used in prior SDP studies. In terms of how SDP studies are validated, we find that 30% of studies use 10-fold cross validation, 29% perform cross-release validation and only 5% perform cross-project validation. The practice of splitting data into training and testing data is very common, and is performed in 77% of the surveyed papers.
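The sketch below illustrates, on synthetic data and with scikit-learn, the two most common schemes in Table IV (a training/testing split and 10-fold cross validation) scored with precision, recall and AUC; it is an illustrative assumption, not the evaluation pipeline of any surveyed study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

# Synthetic, imbalanced data standing in for defect-prone vs. clean modules.
X, y = make_classification(n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)

# (1) Simple training/testing split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("precision:", precision_score(y_te, pred))
print("recall:   ", recall_score(y_te, pred))
print("AUC:      ", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))

# (2) 10-fold cross validation, scored by AUC.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
print("10-fold mean AUC:", cross_val_score(clf, X, y, cv=cv, scoring="roc_auc").mean())
```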


Table IV: Performance Evaluation Measures Used in SDP Studies. An ‘o’ means applied, a ‘.’ means not applied and ‘?’ means could not determine.

Columns: Paper | Cross Validation: 10-fold, Cross-project, Cross-release, Training and Testing Data | Predictive Power: Correlation, Precision, Recall, Accuracy, F-measure, AUC, Other | Explanative Power: R2, Deviance Explained

Yuan; 2000 [2000] . . ? o . . . . . .MCR; effectiveness; effi-ciency

. .

Cartwright; 2000 [2000] . . . ? o . . . . . . o .Neufelder; 2000 [2000] . . . . o . . . . . . . .

Khoshgoftaar; 2000 [2000] o . o o . . . . . . MCR . .Khoshgoftaar; 2000 [2000] o . o o . . . . . . MCR . .

Morasca; 2000 [2000] ? . o . . . . . . .Overall completeness;faulty module complete-ness and correctness

. o

Wong; 2000 [2000] ? . o o . o o . . . . . .Fenton; 2000 [2000] . . o . o . . . . . . . .Graves; 2000 [2000] . . . o . . . . . . ? . .

Khoshgoftaar; 2001 [2001] . . . o . . . . . . AAE; ARE . .

El Emam; 2001 [2001] . . o o . o o o . o . . oDenaro; 2002 [2002] . . o o . o o o . . . . o

Briand; 2002 [2002] o o . o . . . . . .Completeness; correct-ness; cost-effectiveness

. .

Khoshgoftaar;2002 [2002] . . o o . . . . . . MCR . .Quah; 2003 [2003] ? . . o . . . . . . Min/MSE; Min/AAE o .

Khoshgoftaar; 2003 [2003] o . o o . . . . . . AAE; ARE and ANOVA . .

Guo; 2003 [2003] o . . o . . . o . .Specificity; Sensitivity;PFA; Effort

. .

Amasaki; 2003 [2003] ? . . o . . . o . . Error rate . .

Succi; 2003 [2003] ? ? ? ? . .. . . . .Applicability; effective-ness and predictive ability

. .

Guo; 2004 [2004] ? . . o . . . o . .Specificity; Sensitivity;PFA; Effort

. .

Li; 2004 [2004] ? ? ? ? . . . . . . AIC . .Ostrand; 2004 [2004] . . o . . . . . . . % faults . .

Koru;2005 [2005] o . . o . o o . o . . . .

Gyimothy; 2005 [2005] . . . o . . . . . .Completeness and correct-ness

o o

Mockus; 2005 [2005] ? . o o . . . . . . . . o

Nagappan; 2005 [2005a] . . . o o . . . . . . o .

Hassan; 2005 [2005] . . . . . . . . . .Hit rate and avg. predictionage

. .

Nagappan; 2005 [2005b] . . . o o . . . . . . o .

Tomaszewski;2006 [2006] . . . . . . . . . .Compare to random andbest models

. .

Pan; 2006 [2006] o . . o . o o o . . . . .

Nagappan; 2006 [2006] . o . o o . . . . . . . oZhou; 2006 [2006] ? . . ? . o . . . . Correctness; completeness o o

Li; 2006 [2006] . . o . o . . . . . ARE .Knab; 2006 [2006] ? ? ? o o . . o . . Classification rates . .


Arisholm; 2006 [2006] o . o o . . . . . .False positives and falsenegatives

. .

Tomaszewski;2007 [2007] . . . . o . . . . .Compare to random andbest models

. .

Ma; 2007 [2007] o . . o o o . o o .Specificity; Sensitivity;PFA; G-mean

. .

Menzies; 2007 [2007] o . . o POD; PFA and ROC .Olague; 2007 [2007] ? . . o o . . o . . . . .

Bernstein; 2007 [2007] . . o o o . . o . o ROC; RMSE; AAE . .

Aversano; 2007 [2007] o . . o . o o . o . . . .Kim; 2007 [2007] ? . . ? . . . . . . Hit rate . .

Ratzinger; 2007 [2007] o . . o o . . . . . AAE and MSE . .Mizuno; 2007 [2007] . . . o . o o o . . . . .Weyuker; 2007 [2007] . . o ? . . . . . . % faults . .

Zimmermann; 2007 [2007] ? . o o o o o o . . . . .Moser; 2008 [2008] o . . o . o o o . . Cost . .Kamei; 2008 [2008] . . o o . o o . o . . . .

Zimmermann; 2008 [2007] . . . o o . . . . . .o (ontrainingmodels)

.

Zimmermann; 2008 [2008] . . . o o o o . . . .o (ontrainingmodels)

.

Nagappan; 2008 [2008] . . . o o o o . . . . . .Zhang; 2008 [2008] . . . . . . . . . . ARE . .

Pinzger; 2008 [2008] . . . o o o o . . o ROC o .Lessmann; 2008 [2008] . . . o . . . . . o . . .Watanabe; 2008 [2008] o o o o o o o . . . . . .

Kim; 2008 [2008] o . . o o o o o o . . . .Jiang; 2008 [2008] o . . o . . . . . o ROC . .

Weyuker; 2008 [2008] . . o o . . . . . . % faults . .Jiang; 2008 [2008] . . . ? . . . . . . Cost curves . .Tosun; 2008 [2008] ? . . o . . . . . . POD; PFA and balance . .

Koru; 2008 [2008] . . . . o . . . . . % faults . .Menzies; 2008 [2008] ? . . o . . . . . . POD; PFA and balance . .Layman; 2008 [2008] . . . o o . . o . . . . oGoronda; 2008 [2008] ? . . o . . . o . . . . .

Vandecruys; 2008 [2008] ? . . o . . . o . . Specificity; Sensitivity . .

Elish; 2008 [2008] o . . o . o o o o . . . .Ratzinger; 2008 [2008] ? . . o . o . . . . . . .Meneely; 2008 [2008] . . o o o . . . . . % faults . .

Wu; 2008 [2008] . . o o o . . . . . % faults . .Tarvo; 2008 [2008] ? . . o . o o . . o False positive rate; ROC . .

Holschuh; 2009 [2009] . . o o o o o . . . . o .


Jia; 2009 [2009] ? . ? ? . . . . . o . . .Shin; 2009 [2009] . . o o o . . o . . . . .

Mende; 2009 [2009] . . o o . o o o . oFalse positives and falsenegatives

. .

Binkley; 2009 [2009] ? . . o . . . . . . . o .

D’Ambros; 2009 [2009] . . . o o . . . . . . o .Hassan; 2009 [2009] . . . o . . . . . . Error o .

Bird; 2009 [2009] . . o o . o o . o o Nagelkerke coef . .Ferzund; 2009 [2009] o . . o . o o o o . . . .

Liu; 2010 [2010] ? . . o . . . . . . Type I; TypeII errors . .

Erika; 2010 [2010] . . . ? . . . . . .Specificity; Sensitivity;Correctness

. .

Zhou; 2010 [2010] o . o o o . . . . o . o .Mockus; 2010 [2010] . . . . . . . . . . Effect sizes . .Kamei; 2010 [2010] o . o o . . . . . . % faults per LOC . .

Meneely; 2010 [2010] o o . o o o o . o . Inspection rate . .

Nguyen; 2010 [2010] o . . o o o o . . . . o .Shihab; 2010 [2010] o . . o o o o . o . Odds ratios . o

Menzies; 2010 [2010] ? . ? ? . o . o . o POD; POA . .Weyuker; 2010 [2010] ? . . o o . . . . . % Faults in 20% files; FPA . .Mende; 2010 [2010] o . . ? . o o . . . % faults . .

Nugroho; 2010 [2010] o . . o . o o o . .Specificity; FP rate; FNrate; odds-ratios

. .

Song; 2011 [2011] o . . o . . . . . o ROC; PD; PF . .Nguyen; 2011 [2011] o . . o o . . . . . . o .

Mende; 2011 [2011] o o o o . . o . . oCost Effectiveness; DDR;avg. % faults

. .

Lee; 2011 [2011] o . . o o o o . o .AAE; root mean squarederror

. .

Shihab; 2011 [2011] o . . o o o o . . . . . oD’Ambros; 2011 [, 2010] ? . ? o o . . . . o . . .Kpodjedo; 2011 [2011] . . o o . . . . o . % faults o o

Bird; 2011 [2011] ? . . o o o o . o . . o .Meneely; 2011 [2011] ? . . . o . . . . . . o .

Giger; 2011 [2011] ? . . o o o o . . o . o

Percentage of papers 30 5 29 77 39 34 31 23 13 15 - 17 12


REFERENCES

AMASAKI, S., TAKAGI, Y., MIZUNO, O., AND KIKUNO, T. 2003. A bayesian belief network for assessing the likelihood of fault content.In Proceedings of the 14th International Symposium on Software Reliability Engineering. ISSRE ’03. 215–226.

ARISHOLM, E. AND BRIAND, L. C. 2006. Predicting fault-prone components in a java legacy system. In Proceedings of the 2006ACM/IEEE International Symposium on Empirical Software Engineering. ISESE ’06. 8–17.

AVERSANO, L., CERULO, L., AND DEL GROSSO, C. 2007. Learning from bug-introducing changes to prevent fault prone code. In NinthInternational Workshop on Principles of Software Evolution. IWPSE ’07. 19–26.

BERNSTEIN, A., EKANAYAKE, J., AND PINZGER, M. 2007. Improving defect prediction using temporal features and non linear models.In Ninth International Workshop on Principles of Software Evolution. IWPSE ’07. 11–18.

BINKLEY, D., FEILD, H., LAWRIE, D., AND PIGHIN, M. 2009. Increasing diversity: Natural language measures for software fault predic-tion. Journal of Systems and Software 82, 1793–1803.

BIRD, C., NAGAPPAN, N., GALL, H., MURPHY, B., AND DEVANBU, P. 2009. Putting it all together: Using socio-technical networks topredict failures. In Proceedings of the 2009 20th International Symposium on Software Reliability Engineering. ISSRE ’09. 109–119.

BIRD, C., NAGAPPAN, N., MURPHY, B., GALL, H., AND DEVANBU, P. 2011. Don’t touch my code!: examining the effects of ownershipon software quality. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of SoftwareEngineering. ESEC/FSE ’11. 4–14.

BRIAND, L. C., MELO, W. L., AND WUST, J. 2002. Assessing the applicability of fault-proneness models across object-oriented softwareprojects. IEEE Transactions on Software Engineering 28, 706–720.

CARTWRIGHT, M. AND SHEPPERD, M. 2000. An empirical investigation of an object-oriented software system. IEEE Transactions onSoftware Engineering 26, 8 (aug), 786 –796.

D’AMBROS, M., LANZA, M., AND ROBBES, R. Evaluating defect prediction approaches: a benchmark and an extensive comparison.Empirical Software Engineering, 1–47.

D’AMBROS, M., LANZA, M., AND ROBBES, R. 2009. On the relationship between change coupling and software defects. In Proceedingsof the 2009 16th Working Conference on Reverse Engineering. WCRE ’09. 135–144.

D’AMBROS, M., LANZA, M., AND ROBBES, R. 2010. An extensive comparison of bug prediction approaches. In Mining SoftwareRepositories (MSR), 2010 7th IEEE Working Conference on. 31 –41.

DENARO, G. AND PEZZE, M. 2002. An empirical evaluation of fault-proneness models. In Proceedings of the 24th International Confer-ence on Software Engineering. ICSE ’02. 241–251.

ELISH, K. O. AND ELISH, M. O. 2008. Predicting defect-prone software modules using support vector machines. Journal of Systems andSoftware 81, 649–660.

EMAM, K. E., MELO, W., AND MACHADO, J. C. 2001. The prediction of faulty classes using object-oriented design metrics. J. Syst.Softw. 56, 63–75.

ERIKA, A. AND CRUZ, C. 2010. Exploratory study of a uml metric for fault prediction. In Proceedings of the 32nd ACM/IEEE InternationalConference on Software Engineering - Volume 2. ICSE ’10. 361–364.

FENTON, N. E. AND NEIL, M. 1999. A critique of software defect prediction models. IEEE Transactions on Software Engineering 25,675–689.

FENTON, N. E. AND OHLSSON, N. 2000. Quantitative analysis of faults and failures in a complex software system. IEEE Transactions onSoftware Engineering 26, 797–814.

FERZUND, J., AHSAN, S. N., AND WOTAWA, F. 2009. Software change classification using hunk metrics. In International Conference onSoftware Maintenance. 471–474.

GIGER, E., PINZGER, M., AND GALL, H. C. 2011. Comparing fine-grained source code changes and code churn for bug prediction. InProceedings of the 8th Working Conference on Mining Software Repositories. MSR ’11. 83–92.

GONDRA, I. 2008. Applying machine learning to software fault-proneness prediction. Journal of Systems and Software 81, 186–195.

GRAVES, T. L., KARR, A. F., MARRON, J. S., AND SIY, H. 2000. Predicting fault incidence using software change history. IEEE Transactions on Software Engineering 26, 7 (July), 653–661.

GUO, L., CUKIC, B., AND SINGH, H. 2003. Predicting fault prone modules by the dempster-shafer belief networks. In Automated Software Engineering, 2003. Proceedings. 18th IEEE International Conference on. 249–252.

GUO, L., MA, Y., CUKIC, B., AND SINGH, H. 2004. Robust prediction of fault-proneness by random forests. In Proceedings of the 15th International Symposium on Software Reliability Engineering. 417–428.

GYIMOTHY, T., FERENC, R., AND SIKET, I. 2005. Empirical validation of object-oriented metrics on open source software for fault prediction. IEEE Transactions on Software Engineering 31, 897–910.

HASSAN, A. E. 2009. Predicting faults using the complexity of code changes. In Proceedings of the 31st International Conference on Software Engineering. ICSE ’09. 78–88.

HASSAN, A. E. AND HOLT, R. C. 2005. The top ten list: Dynamic fault prediction. In Proceedings of the 21st IEEE International Conference on Software Maintenance. 263–272.

HOLSCHUH, T., PUSER, M., HERZIG, K., ZIMMERMANN, T., RAHUL, P., AND ZELLER, A. 2009. Predicting defects in sap java code: An experience report. In International Conference on Software Engineering. 172–181.

JIA, H., SHU, F., YANG, Y., AND LI, Q. 2009. Data transformation and attribute subset selection: Do they help make differences in software failure prediction? In International Conference on Software Maintenance. 519–522.


JIANG, Y., CUKI, B., MENZIES, T., AND BARTLOW, N. 2008. Comparing design and code metrics for software quality prediction. InProceedings of the 4th International Workshop on Predictor Models in Software Engineering. PROMISE ’08. 11–18.

JIANG, Y., CUKIC, B., AND MENZIES, T. 2008. Cost curve evaluation of fault prediction models. In Proceedings of the 2008 19thInternational Symposium on Software Reliability Engineering. 197–206.

KAMEI, Y., MATSUMOTO, S., MONDEN, A., MATSUMOTO, K.-I., ADAMS, B., AND HASSAN, A. E. 2010. Revisiting common bugprediction findings using effort-aware models. In Proceedings of the 2010 IEEE International Conference on Software Maintenance.ICSM ’10. 1–10.

KAMEI, Y., MONDEN, A., MORISAKI, S., AND MATSUMOTO, K.-I. 2008. A hybrid faulty module prediction using association rule miningand logistic regression analysis. In Proceedings of the Second ACM-IEEE International Symposium on Empirical Software Engineeringand Measurement. ESEM ’08. 279–281.

KHOSHGOFTAAR, T., GAO, K., AND SZABO, R. M. 2001. An application of zero-inflated poisson regression for software fault prediction.In International Symposium on Software Reliability Engineering. ISSRE ’01. 66–73.

KHOSHGOFTAAR, T., SHAN, R., AND ALLEN, E. 2000. Improving tree-based models of software quality with principal componentsanalysis. In International Symposium on Software Reliability Engineering. 198 –209.

KHOSHGOFTAAR, T., THAKER, V., AND ALLEN, E. 2000. Modeling fault-prone modules of subsystems. In International Symposium onSoftware Reliability Engineering. 259 –267.

KHOSHGOFTAAR, T. M. AND SELIYA, N. 2003. Fault prediction modeling for software quality estimation: Comparing commonly usedtechniques. Empirical Software Engineering 8, 255–283.

KHOSHGOFTAAR, T. M., YUAN, X., ALLEN, E. B., JONES, W. D., AND HUDEPOHL, J. P. 2002. Uncertain classification of fault-pronesoftware modules. Empirical Software Engineering 7, 297–318.

KIM, S., WHITEHEAD, JR., E. J., AND ZHANG, Y. 2008. Classifying software changes: Clean or buggy? IEEE Transactions on SoftwareEngineering 34, 181–196.

KIM, S., ZIMMERMANN, T., WHITEHEAD JR., E. J., AND ZELLER, A. 2007. Predicting faults from cached history. In Proceedings of the29th International Conference on Software Engineering. ICSE ’07. 489–498.

KNAB, P., PINZGER, M., AND BERNSTEIN, A. 2006. Predicting defect densities in source code files with decision tree learners. InProceedings of the 2006 International Workshop on Mining Software Repositories. MSR ’06. 119–125.

KORU, A. G., EMAM, K. E., ZHANG, D., LIU, H., AND MATHEW, D. 2008. Theory of relative defect proneness. Empirical SoftwareEngineering 13, 473–498.

KORU, A. G. AND LIU, H. 2005. An investigation of the effect of module size on defect prediction using static measures. In Proceedingsof the 2005 Workshop on Predictor Models in Software Engineering. PROMISE ’05. 1–5.

KPODJEDO, S., RICCA, F., GALINIER, P., GUEHENEUC, Y.-G., AND ANTONIOL, G. 2011. Design evolution metrics for defect predictionin object oriented systems. Empirical Software Engineering 16, 141–175.

LAYMAN, L., KUDRJAVETS, G., AND NAGAPPAN, N. 2008. Iterative identification of fault-prone binaries using in-process metrics.In Proceedings of the Second ACM-IEEE International Symposium on Empirical Software Engineering and Measurement. ESEM ’08.206–212.

LEE, T., NAM, J., HAN, D., KIM, S., AND IN, H. P. 2011. Micro interaction metrics for defect prediction. In Proceedings of the 19thACM SIGSOFT Symposium and the 13th European conference on Foundations of Software Engineering. ESEC/FSE ’11. 311–321.

LESSMANN, S., BAESENS, B., MUES, C., AND PIETSCH, S. 2008. Benchmarking classification models for software defect prediction: Aproposed framework and novel findings. IEEE Transactions on Software Engineering 34, 485–496.

LI, P. L., HERBSLEB, J., SHAW, M., AND ROBINSON, B. 2006. Experiences and results from initiating field defect prediction and producttest prioritization efforts at abb inc. In Proceedings of the 28th International Conference on Software Engineering. ICSE ’06. 413–422.

LI, P. L., SHAW, M., HERBSLEB, J., RAY, B., AND SANTHANAM, P. 2004. Empirical evaluation of defect projection models for widely-deployed production software systems. In Proceedings of the 12th ACM SIGSOFT Twelfth International Symposium on Foundations ofSoftware Engineering. SIGSOFT ’04/FSE-12. 263–272.

LIU, Y., KHOSHGOFTAAR, T. M., AND SELIYA, N. 2010. Evolutionary optimization of software quality modeling with multiple reposito-ries. IEEE Transactions on Software Engineering 36, 852–864.

MA, Y. AND CUKIC, B. 2007. Adequate and precise evaluation of quality models in software engineering studies. In Proceedings of theThird International Workshop on Predictor Models in Software Engineering. PROMISE ’07. 1–9.

MENDE, T. AND KOSCHKE, R. 2010. Effort-aware defect prediction models. In Proceedings of the 2010 14th European Conference onSoftware Maintenance and Reengineering. CSMR ’10. 107–116.

MENDE, T., KOSCHKE, R., AND LESZAK, M. 2009. Evaluating defect prediction models for a large evolving software system. InProceedings of the 2009 European Conference on Software Maintenance and Reengineering. 247–250.

MENDE, T., KOSCHKE, R., AND PELESKA, J. 2011. On the utility of a defect prediction model during hw/sw integration testing: Aretrospective case study. In Proceedings of the 2011 15th European Conference on Software Maintenance and Reengineering. CSMR’11. 259–268.

MENEELY, A., ROTELLA, P., AND WILLIAMS, L. 2011. Does adding manpower also affect quality?: an empirical, longitudinal analysis.In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering.ESEC/FSE ’11. 81–90.


MENEELY, A. AND WILLIAMS, L. 2010. Strengthening the empirical analysis of the relationship between linus’ law and software security.In Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement. ESEM ’10.9:1–9:10.

MENEELY, A., WILLIAMS, L., SNIPES, W., AND OSBORNE, J. 2008. Predicting failures with developer networks and social networkanalysis. In Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering. SIGSOFT’08/FSE-16. 13–23.

MENZIES, T., GREENWALD, J., AND FRANK, A. 2007. Data mining static code attributes to learn defect predictors. IEEE Transactionson Software Engineering 33, 2–13.

MENZIES, T., MILTON, Z., TURHAN, B., CUKIC, B., JIANG, Y., AND BENER, A. 2010. Defect prediction from static code features:current results, limitations, new approaches. Automated Software Engineering 17, 375–407.

MENZIES, T., TURHAN, B., BENER, A., GAY, G., CUKIC, B., AND JIANG, Y. 2008. Implications of ceiling effects in defect predictors.In Proceedings of the 4th International Workshop on Predictor Models in Software Engineering. PROMISE ’08. 47–54.

MIZUNO, O., IKAMI, S., NAKAICHI, S., AND KIKUNO, T. 2007. Spam filter based approach for finding fault-prone software modules. InProceedings of the Fourth International Workshop on Mining Software Repositories. MSR ’07. 4–7.

MOCKUS, A. 2010. Organizational volatility and its effects on software defects. In Proceedings of the Eighteenth ACM SIGSOFT Interna-tional Symposium on Foundations of Software Engineering. FSE ’10. 117–126.

MOCKUS, A., ZHANG, P., AND LI, P. L. 2005. Predictors of customer perceived software quality. In Proceedings of the 27th InternationalConference on Software Engineering. ICSE ’05. 225–233.

MORASCA, S. AND RUHE, G. 2000. A hybrid approach to analyze empirical software engineering data and its application to predictmodule fault-proneness in maintenance. Journal of Systems and Software 53, 225–237.

MOSER, R., PEDRYCZ, W., AND SUCCI, G. 2008. A comparative analysis of the efficiency of change metrics and static code attributes fordefect prediction. In Proceedings of the 30th International Conference on Software Engineering. ICSE ’08. 181–190.

NAGAPPAN, N. AND BALL, T. 2005a. Static analysis tools as early indicators of pre-release defect density. In International Conference onSoftware Engineering(ICSE’05). 580–586.

NAGAPPAN, N. AND BALL, T. 2005b. Use of relative code churn measures to predict system defect density. In International Conferenceon Software Engineering (ICSE’05). 284–292.

NAGAPPAN, N., BALL, T., AND ZELLER, A. 2006. Mining metrics to predict component failures. In International Conference on SoftwareEngineering (ICSE’06). 452–461.

NAGAPPAN, N., MURPHY, B., AND BASILI, V. 2008. The influence of organizational structure on software quality. In ACM/IEEE 30thInternational Conference on Software Engineering, 2008. 521 –530.

NEUFELDER, A. M. 2000. How to measure the impact of specific development practices on fielded defect density. In InternationalSymposium on Software Reliability Engineering. ISSRE ’00. 148–159.

NGUYEN, T. H. D., ADAMS, B., AND HASSAN, A. E. 2010. Studying the impact of dependency network measures on software quality.In Proceedings of the 2010 IEEE International Conference on Software Maintenance. ICSM ’10. 1–10.

NGUYEN, T. T., NGUYEN, T. N., AND PHUONG, T. M. 2011. Topic-based defect prediction (nier track). In Proceedings of the 33rdInternational Conference on Software Engineering. ICSE ’11. 932–935.

NUGROHO, A., CHAUDRON, M., AND ARISHOLM, E. 2010. Assessing uml design metrics for predicting fault-prone classes in a javasystem. In Mining Software Repositories (MSR), 2010 7th IEEE Working Conference on. 21 –30.

OLAGUE, H. M., ETZKORN, L. H., GHOLSTON, S., AND QUATTLEBAUM, S. 2007. Empirical validation of three software metrics suitesto predict fault-proneness of object-oriented classes developed using highly iterative or agile software development processes. IEEETransactions on Software Engineering 33, 402–419.

OSTRAND, T. J., WEYUKER, E. J., AND BELL, R. M. 2004. Where the bugs are. In Proceedings of the 2004 ACM SIGSOFT InternationalSymposium on Software Testing and Analysis. ISSTA ’04. 86–96.

PAN, K., KIM, S., AND WHITEHEAD, JR., E. J. 2006. Bug classification using program slicing metrics. In Proceedings of the Sixth IEEEInternational Workshop on Source Code Analysis and Manipulation. 31–42.

PINZGER, M., NAGAPPAN, N., AND MURPHY, B. 2008. Can developer-module networks predict failures? In Proceedings of the 16thACM SIGSOFT International Symposium on Foundations of Software Engineering. SIGSOFT ’08/FSE-16. 2–12.

QUAH, T.-S. AND THWIN, M. M. T. 2003. Application of neural networks for software quality prediction using object-oriented metrics.In Proceedings of the International Conference on Software Maintenance. ICSM ’03. 116–126.

RATZINGER, J., GALL, H., AND PINZGER, M. 2007. Quality assessment based on attribute series of software evolution. In Proceedingsof the 14th Working Conference on Reverse Engineering. 80–89.

RATZINGER, J., SIGMUND, T., AND GALL, H. C. 2008. On the relation of refactorings and software defect prediction. In Proceedings ofthe 2008 International Working Conference on Mining Software Repositories. MSR ’08. 35–38.

SHIHAB, E., JIANG, Z. M., IBRAHIM, W. M., ADAMS, B., AND HASSAN, A. E. 2010. Understanding the impact of code and processmetrics on post-release defects: a case study on the eclipse project. In Proceedings of the 2010 ACM-IEEE International Symposium onEmpirical Software Engineering and Measurement. ESEM ’10. 4:1–4:10.

SHIHAB, E., MOCKUS, A., KAMEI, Y., ADAMS, B., AND HASSAN, A. E. 2011. High-impact defects: a study of breakage and surprise de-fects. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering.ESEC/FSE ’11. 300–310.


SHIN, Y., BELL, R., OSTRAND, T., AND WEYUKER, E. 2009. Does calling structure information improve the accuracy of fault prediction?In Proceedings of the 2009 6th IEEE International Working Conference on Mining Software Repositories. MSR ’09. 61–70.

SONG, Q., JIA, Z., SHEPPERD, M., YING, S., AND LIU, J. 2011. A general software defect-proneness prediction framework. IEEETransactions on Software Engineering 37, 356–370.

SUCCI, G., PEDRYCZ, W., STEFANOVIC, M., AND MILLER, J. 2003. Practical assessment of the models for identification of defect-proneclasses in object-oriented commercial systems using design metrics. Journal of Systems and Software 65, 1–12.

TARVO, A. 2008. Using statistical models to predict software regressions. In Proceedings of the 2008 19th International Symposium onSoftware Reliability Engineering. 259–264.

TOMASZEWSKI, P., GRAHN, H., AND LUNDBERG, L. 2006. A method for an accurate early prediction of faults in modified classes. InProceedings of the 22nd IEEE International Conference on Software Maintenance. 487–496.

TOMASZEWSKI, P., HAKANSSON, J., GRAHN, H., AND LUNDBERG, L. 2007. Statistical models vs. expert estimation for fault predictionin modified code - an industrial case study. Journal of Systems and Software 80, 1227–1238.

TOSUN, A., TURHAN, B., AND BENER, A. 2008. Ensemble of software defect predictors: a case study. In Proceedings of the SecondACM-IEEE International Symposium on Empirical Software Engineering and Measurement. ESEM ’08. 318–320.

VANDECRUYS, O., MARTENS, D., BAESENS, B., MUES, C., DE BACKER, M., AND HAESEN, R. 2008. Mining software repositories forcomprehensible software fault prediction models. Journal of Systems and Software 81, 823–839.

WATANABE, S., KAIYA, H., AND KAIJIRI, K. 2008. Adapting a fault prediction model to allow inter language reuse. In Proceedings ofthe 4th International Workshop on Predictor Models in Software Engineering. PROMISE ’08. 19–24.

WEYUKER, E. J., OSTRAND, T. J., AND BELL, R. M. 2007. Using developer information as a factor for fault prediction. In Proceedingsof the Third International Workshop on Predictor Models in Software Engineering. PROMISE ’07. 8–14.

WEYUKER, E. J., OSTRAND, T. J., AND BELL, R. M. 2008. Comparing negative binomial and recursive partitioning models for faultprediction. In Proceedings of the 4th International Workshop on Predictor Models in Software Engineering. PROMISE ’08. 3–10.

WEYUKER, E. J., OSTRAND, T. J., AND BELL, R. M. 2010. Comparing the effectiveness of several modeling methods for fault prediction.Empirical Software Engineering 15, 277–295.

WONG, W. E., HORGAN, J. R., SYRING, M., ZAGE, W., AND ZAGE, D. 2000. Applying design metrics to predict fault-proneness: a casestudy on a large-scale software system. Software: Practice and Experience 30, 14.

WU, S., WANG, Q., AND YANG, Y. 2008. Quantitative analysis of faults and failures with multiple releases of softpm. In Proceedings ofthe Second ACM-IEEE International Symposium on Empirical Software Engineering and Measurement. ESEM ’08. 198–205.

YUAN, X., KHOSHGOFTAAR, T., ALLEN, E., AND GANESAN, K. 2000. An application of fuzzy clustering to software quality prediction.In IEEE Symposium on Application-Specific Systems and Software Engineering Technology, 2000. 85 –90.

ZHANG, H. 2008. An initial study of the growth of eclipse defects. In Proceedings of the 2008 International Working Conference on MiningSoftware Repositories. MSR ’08. 141–144.

ZHOU, Y. AND LEUNG, H. 2006. Empirical analysis of object-oriented design metrics for predicting high and low severity faults. IEEETransactions on Software Engineering 32, 771–789.

ZHOU, Y., XU, B., AND LEUNG, H. 2010. On the ability of complexity metrics to predict fault-prone classes in object-oriented systems.Journal of Systems and Software 83, 660–674.

ZIMMERMANN, T. AND NAGAPPAN, N. 2007. Predicting subsystem failures using dependency graph complexities. In InternationalSymposium on Software Reliability, 2007. 227 –236.

ZIMMERMANN, T. AND NAGAPPAN, N. 2008. Predicting defects using network analysis on dependency graphs. In Proceedings of the30th International Conference on Software Engineering. ICSE ’08. 531–540.

ZIMMERMANN, T., PREMRAJ, R., AND ZELLER, A. 2007. Predicting defects for Eclipse. In PROMISE ’07: Proceedings of the ThirdInternational Workshop on Predictor Models in Software Engineering. 1–7.
