Patterns for Extracting High Level Information from Bug Reports


Paper at https://github.com/rodrigorgs/dapse13-analysis/blob/master/preprint/icsews13dapse-id11-p-16145-preprint.pdf?raw=true


Rodrigo Souza1,*

Christina Chavez1

Roberto Bittencourt2

1 Federal University of Bahia, Brazil 2 State University of Feira de Santana, Brazil

DAPSE’13: International Workshop on Data Analysis Patterns in Software Engineering

* speaker; email: rodrigo@dcc.ufba.br

May 21, 2013 San Francisco, USA

Bug reports provide insight about…

- the quality of the software
- the quality of the process

Bug reports are like oysters…

If you look inside, you may find something valuable

In This Talk

Two patterns to help you extract information about the software verification process

1. Fixers and Verifiers

2. Testing Phases

they’re like recipes for data scientists

Fixers and Verifiers

Find the quality engineering team (if it exists).

Fixers and Verifiers: Context

Developers assume specific roles in a team:

- fixer: fixes bugs
- verifier: verifies if fixes are appropriate

A quality engineering team is formed by verifiers, who perform most of the verifications in the project (among other activities)

The roles should be taken into account in data analysis

You can’t judge a verifier by the number of fixes

Fixers and Verifiers: Problem

Find the quality engineering team (if it exists)

Fixers and Verifiers: Solution

Ingredients

You’ll need, for each developer:

- Number of times they changed status to VERIFIED (i.e., verifications)
- Number of times they changed resolution to FIXED (i.e., fixes)

Directions

1. For each developer, compute the ratio: verifications / (1 + fixes)

2. Choose a threshold and assume that a developer is a verifier if ratio > threshold. How?
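Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the bug tracker history has already been reduced to `(developer, action)` pairs, where `action` is `"VERIFIED"` or `"FIXED"`. The function name and the input layout are hypothetical.

```python
from collections import Counter


def verifier_ratios(events):
    """Compute verifications / (1 + fixes) for each developer.

    `events` is an iterable of (developer, action) pairs; this layout is
    an assumption made for the example, not a bug tracker format.
    """
    verifications = Counter()
    fixes = Counter()
    for developer, action in events:
        if action == "VERIFIED":
            verifications[developer] += 1
        elif action == "FIXED":
            fixes[developer] += 1
    developers = set(verifications) | set(fixes)
    # The "1 +" in the denominator avoids division by zero for
    # developers who never fixed a bug.
    return {dev: verifications[dev] / (1 + fixes[dev]) for dev in developers}


# Made-up data: ana mostly verifies, bob mostly fixes.
events = (
    [("ana", "VERIFIED")] * 6 + [("ana", "FIXED")]
    + [("bob", "FIXED")] * 3 + [("bob", "VERIFIED")]
)
ratios = verifier_ratios(events)
```

With this toy data, `ana` gets a ratio of 3.0 and `bob` a ratio of 0.25, so any threshold between the two would classify only `ana` as a verifier.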

2.1 For each ratio, use it as the threshold and compute:

- the number of verifiers in the project (x)
- the % of verifications performed by verifiers (y)

2.2 Plot this data
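The data for this plot could be computed as in the sketch below. It assumes a dictionary mapping each developer to a `(verifications, fixes)` pair; that layout and the function name are hypothetical, and the number of verifiers is expressed as a percentage of all developers in the input.

```python
def threshold_candidates(counts):
    """Return (threshold, % of developers who are verifiers,
    % of verifications performed by verifiers) for each candidate.

    `counts` maps developer -> (verifications, fixes); this layout is
    an assumption made for the example.
    """
    total_verifications = sum(v for v, _ in counts.values())
    if total_verifications == 0:
        return []
    ratios = {dev: v / (1 + f) for dev, (v, f) in counts.items()}
    rows = []
    # Every observed ratio is tried as a threshold candidate.
    for threshold in sorted(set(ratios.values())):
        verifiers = [dev for dev, r in ratios.items() if r > threshold]
        by_verifiers = sum(counts[dev][0] for dev in verifiers)
        rows.append((
            threshold,
            100.0 * len(verifiers) / len(counts),         # x
            100.0 * by_verifiers / total_verifications,   # y
        ))
    return rows


# Made-up counts for four developers.
counts = {"ana": (60, 1), "bob": (30, 2), "cy": (5, 4), "di": (5, 9)}
rows = threshold_candidates(counts)
```

Each row supplies one point of the plot: the second element is the x value, the third is the y value, and the first is the label (the threshold candidate).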

How to choose a threshold?

[Plot: % of verifications by verifiers (y-axis, 30% to 90%) versus size of QE team = number of verifiers (%) (x-axis, 2% to 14%). Each point is labeled with its ratio (threshold candidate), from 1 to 40.]

2.3 Fit an arm, find the elbow!

[Plot: % of verifications by verifiers (y-axis, 30% to 90%) versus size of QE team = number of verifiers (%) (x-axis, 2% to 12%), with points labeled by ratio threshold from 1 to 40.]
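The slides locate the elbow visually. For an automated first guess, one common heuristic (not necessarily what the authors did) is to pick the point farthest from the straight line joining the first and last points of the curve. The sketch below assumes the points are `(x, y)` pairs on comparable scales, as is the case here since both axes are percentages; `find_elbow` is a hypothetical name.

```python
import math


def find_elbow(points):
    """Return the point farthest from the chord joining the endpoints.

    `points` is a list of (x, y) pairs sorted by x. This is a generic
    elbow heuristic, offered as an assumption, not the slides' method.
    """
    (x0, y0), (x1, y1) = points[0], points[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        raise ValueError("need at least two distinct endpoints")

    def distance_to_chord(point):
        x, y = point
        # Perpendicular distance from (x, y) to the line through the endpoints.
        return abs((y1 - y0) * (x - x0) - (x1 - x0) * (y - y0)) / length

    return max(points, key=distance_to_chord)


# Made-up curve: steep rise up to x = 4, then a plateau.
curve = [(2, 40), (4, 80), (8, 85), (14, 90)]
elbow = find_elbow(curve)
```

For this toy curve the heuristic picks `(4, 80)`. The result is best treated as a suggestion to confirm against the plot.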

3. If the % of verifications by verifiers is high*, they form a quality engineering team.

* e.g., > 50% (84% in the slide's example)

Fixers and Verifiers: Discussion

Don’t use the absolute number of verifications, because developers may fix & verify simple bugs

If developers are expected to change roles over time, use sliding windows.

Testing Phases

Identify testing phases in the software development life cycle.

Testing Phases: Context

In mature projects, new features and bug fixes are verified before being released to the public

When are bugs verified?

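[Diagram: two timelines. Without a testing phase, fixes and verifications alternate: Fix, Verify, Fix, Verify, Fix, Verify. With a testing phase, fixes accumulate first (Fix, Fix, Fix) and are then verified together (Verify, Verify, Verify); the run of verifications is the testing phase.]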

Failing to recognize testing phases can mislead your analyses

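[Diagram: two timelines. When fixes and verifications alternate, the time between "Fix bug #5" and "Verify bug #5" is labeled "time ~ complexity". When verifications happen in a testing phase, the time between "Fix bug #7" and "Verify bug #7" is labeled "time ~ …?".]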

Testing Phases: Problem

Identify testing phases in the software development life cycle

Testing Phases: Solution

Ingredients

You’ll need:

- Time of verifications
- Release dates (optional)

Directions (solution #1)

1. Plot the accumulated number of verifications over time.

2. If possible, highlight release dates.

3. Find cliffs, especially before release dates (they are testing phases).

[Plot: accumulated number of verifications (y-axis) over time (x-axis)]
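The series behind the plot in step 1 could be built as follows. This is a minimal sketch that assumes one `datetime.date` per verification; the function name is hypothetical, and the resulting pairs can be passed to any plotting library.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def cumulative_verifications(verification_dates):
    """Return (day, accumulated number of verifications) pairs, sorted by day.

    `verification_dates` holds one date per verification; this input
    layout is an assumption made for the example.
    """
    per_day = Counter(verification_dates)
    days = sorted(per_day)
    totals = accumulate(per_day[day] for day in days)
    return list(zip(days, totals))


# Made-up data: two verifications on Jan 1, one on Jan 3.
series = cumulative_verifications(
    [date(2013, 1, 1), date(2013, 1, 3), date(2013, 1, 1)]
)
```

A cliff then shows up as a large jump in the accumulated total over a short span of days.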

Directions (solution #2)

1. Apply Kleinberg’s algorithm to verification times in order to detect verification bursts.

2. There’s no 2.

[Plot: detected bursts over time (bursts = testing phases)]
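To make the idea concrete, here is a simplified two-state variant of Kleinberg's burst detection (the original uses an automaton with an unbounded number of states). It is a sketch under stated assumptions, not the implementation used by the authors: verification times are plain numbers (for example, days since the first verification), and the parameters `s` and `gamma` are illustrative defaults. A Viterbi-style pass finds the cheapest labeling of each gap between consecutive verifications as baseline (0) or burst (1).

```python
import math


def two_state_bursts(times, s=2.0, gamma=1.0):
    """Label each gap between consecutive events as 0 (baseline) or 1 (burst)."""
    times = sorted(times)
    gaps = [b - a for a, b in zip(times, times[1:])]
    if not gaps:
        return []
    total = sum(gaps)
    if total <= 0:
        raise ValueError("events must span a positive amount of time")
    n = len(gaps)
    rates = [n / total, s * n / total]   # baseline and burst event rates
    up_cost = gamma * math.log(n)        # cost of entering the burst state

    def emit(state, gap):
        # Negative log-likelihood of the gap under an exponential distribution.
        return rates[state] * gap - math.log(rates[state])

    cost = [0.0, math.inf]               # the sequence starts in the baseline state
    back = []
    for gap in gaps:
        new_cost, pointers = [], []
        for state in (0, 1):
            candidates = [
                cost[prev] + (up_cost if state > prev else 0.0)
                for prev in (0, 1)
            ]
            prev = 0 if candidates[0] <= candidates[1] else 1
            new_cost.append(candidates[prev] + emit(state, gap))
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)

    # Walk the back-pointers from the cheapest final state.
    state = 0 if cost[0] <= cost[1] else 1
    labels = []
    for pointers in reversed(back):
        labels.append(state)
        state = pointers[state]
    labels.reverse()
    return labels


def burst_intervals(times, labels):
    """Turn gap labels into (start_time, end_time) burst intervals."""
    times = sorted(times)
    intervals, start = [], None
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = times[i]
        elif label == 0 and start is not None:
            intervals.append((start, times[i]))
            start = None
    if start is not None:
        intervals.append((start, times[-1]))
    return intervals


# Made-up data: sparse verifications, a dense run from t=40 to t=48, then sparse again.
times = [0, 10, 20, 30, 40, 41, 42, 43, 44, 45, 46, 47, 48, 58, 68, 78]
labels = two_state_bursts(times)
bursts = burst_intervals(times, labels)
```

For this toy input the dense run is reported as a single burst, `(40, 48)`. Raising `gamma` makes the detector more reluctant to enter the burst state; raising `s` requires bursts to be more intense relative to the baseline rate.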

Testing Phases: Discussion

If the number of verifications on a particular day is too high, they may be the result of mass updates

Look Out For Mass Updates and remove them before looking for testing phases
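The slides do not say how high is "too high". As a purely illustrative heuristic, the sketch below flags days whose verification count exceeds a multiple of the median daily count. The function name and the `factor` default are hypothetical and should be tuned per project.

```python
from collections import Counter
from statistics import median


def suspected_mass_update_days(verification_days, factor=10):
    """Return the days whose verification count exceeds factor * median.

    `verification_days` holds one day label per verification. The
    median-multiple rule is an assumption, not a criterion from the slides.
    """
    per_day = Counter(verification_days)
    if not per_day:
        return []
    typical = median(per_day.values())
    return sorted(day for day, n in per_day.items() if n > factor * typical)


# Made-up data: 200 verifications on a single day stand out.
days = (
    ["2013-01-01"] * 2 + ["2013-01-02"] * 3
    + ["2013-01-03"] * 200 + ["2013-01-04"] * 2
)
flagged = suspected_mass_update_days(days)
```

Flagged days are candidates for manual inspection before they are removed from the data.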

Testing phases are less common in projects with quality engineering teams

Thank you!

Go beyond the surface to find pearls in bug reports!
