
Improving Cluster Selection Techniques of Regression Testing

by Slice Filtering

Yongwei Duan, Zhenyu Chen, Zhihong Zhao, Ju Qian and Zhongjun Yang

Software Institute, Nanjing University, Nanjing, China

http://software.nju.edu.cn/zychen

Outline

• Introduction

• Our Approach

• Experiment and Evaluation

• Future Work

Introduction

• Test selection techniques

• Cluster selection techniques

• Problems

Test selection techniques

• Rerunning all of the existing test cases is costly in regression testing.

• Test selection techniques choose a subset of test cases to rerun.

Cluster Selection

Cluster selection overview: run test cases → collect execution profiles (basic block level) → clustering → clusters of test cases → sampling → a reduced test suite

Problems

• Too much data to cluster

– Huge amount of execution traces

– Always a high dimension

Just focus on the code fragments that are actually relevant to the program modification!

Our approach

• Overview

• Slice filtering

• Clustering analysis

• Sampling

Our approach

• Overview

Running test cases → execution traces → trace filtering → filtered traces → cluster analysis → clusters → sampling → reduced test suite

Slice filtering

• The execution traces are too detailed to be used directly in clustering analysis.

• We use a program slice to filter out trace fragments that are irrelevant to the program modification.

Slice filtering cont’d

• Statement 2 is changed from ‘if(m<n)’ to ‘if(m<=n)’

• We compute a program slice with respect to statement 2 and intersect it with each execution trace.

• Given 3 test cases, we compare their execution traces and filtered execution traces.

[Figure: source listing of the example program; statement 2 after the change reads if(m<=n){ ]

Slice filtering cont’d

Test case   Input m   Input n   Execution trace (statement no.)    Statement no. after filtering
t1          1         0         1,2,4,5,6,7,8,9,10,11,12,13,14     2,4,5,6,7,8
t2          -1        0         1,2,3,5,6,7,8,9,10,11,12,13,14     2,3,5,6,7,8
t3          -1        1         1,2,3,5,6,7,8,9                    2,3,5,6,7,8

• Execution traces are much smaller after program slice filtering.

• The traces of t2 and t3 become identical after filtering, while the difference between t1 and t2 is magnified.

• To condense the traces further, adjacent statements within a basic block are combined into one statement.

• Patterns are easier to reveal with simple execution traces.
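The filtering step in this example can be sketched in Python as a plain intersection of each trace with the slice. The slides do not list the slice itself, so the set `SLICE` below is an assumption inferred from the filtered column of the table (statements 2 to 8), and the names `SLICE`, `TRACES` and `filter_trace` are illustrative only.

```python
# Assumed program slice with respect to statement 2 (inferred from the table,
# not given explicitly in the slides).
SLICE = {2, 3, 4, 5, 6, 7, 8}

# Execution traces (statement numbers) of the three test cases in the table.
TRACES = {
    "t1": [1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
    "t2": [1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
    "t3": [1, 2, 3, 5, 6, 7, 8, 9],
}


def filter_trace(trace, program_slice):
    """Keep only the statements of a trace that belong to the slice."""
    return [stmt for stmt in trace if stmt in program_slice]


filtered = {name: filter_trace(trace, SLICE) for name, trace in TRACES.items()}

for name, trace in filtered.items():
    print(name, trace)
```

Under this assumed slice, t2 and t3 end up with the same filtered trace, while t1 still differs from both at statements 3 and 4.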

Slice filtering cont’d

• But the number of test cases is still large.

• If a trace is too small (below a threshold) after intersection with the program slice, it is unlikely to be a fault-revealing test case, so we remove it from the test suite.

Slice filtering cont’d

• Filtering rate

– We define the filtering rate FR as follows: if the threshold is M and the size of the program slice is N, then FR = M / N * 100%.

– When FR gets lower, the effect of filtering diminishes, i.e. fewer features can be eliminated.
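A minimal sketch of how the threshold could be applied, assuming that FR is given as a fraction (0.3 rather than 30%) and that the "size" of a filtered trace is its number of distinct statements. Both are assumptions, the data is made up, and the function names are hypothetical.

```python
def trace_size(trace):
    """Assumed size measure: number of distinct statements in the trace."""
    return len(set(trace))


def remove_small_traces(filtered_traces, fr, slice_size):
    """Drop test cases whose filtered trace is below the threshold M = FR * N."""
    threshold = fr * slice_size
    return {
        name: trace
        for name, trace in filtered_traces.items()
        if trace_size(trace) >= threshold
    }


# Hypothetical filtered traces for a slice of N = 10 statements.
slice_size = 10
filtered_traces = {
    "ta": [1, 2],
    "tb": [1, 2, 3, 4],
    "tc": [1, 2, 3, 4, 5, 6],
}

kept_03 = remove_small_traces(filtered_traces, 0.3, slice_size)
kept_05 = remove_small_traces(filtered_traces, 0.5, slice_size)
print(sorted(kept_03))  # ['tb', 'tc']
print(sorted(kept_05))  # ['tc']
```

With this made-up data, the lower rate removes one test case and the higher rate removes two, which matches the remark that a lower FR filters less.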

Slice filtering cont’d

• Why not just use dynamic slicing?

– Computing dynamic slices is complex and time-consuming.

– Effective dynamic slicing tools are hard to come by.

Clustering analysis

• Distance measure

– A filtered trace is represented as $f_i = \langle a_{i1}, a_{i2}, \ldots, a_{in} \rangle$, where $a_{ik}$ is the execution count of the k-th basic block. The distance between two filtered traces $f_i$ and $f_j$ is:

$$D(f_i, f_j) = \sqrt{\sum_{k=1}^{n} (a_{ik} - a_{jk})^2}$$
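Reading the distance measure as the Euclidean distance between two vectors of basic-block execution counts, it can be sketched as follows. The count vectors are made-up values and the function name `distance` is illustrative.

```python
import math


def distance(fi, fj):
    """Euclidean distance between two filtered traces given as count vectors."""
    if len(fi) != len(fj):
        raise ValueError("filtered traces must have the same number of blocks")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fi, fj)))


# Hypothetical execution counts over three basic blocks.
f1 = [1, 0, 2]
f2 = [1, 0, 2]
f3 = [4, 4, 2]

print(distance(f1, f2))  # 0.0
print(distance(f1, f3))  # 5.0
```

Identical filtered traces have distance 0, so test cases that behave the same on the relevant code fall into the same cluster.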

Sampling

• We use adaptive sampling in our approach

– We first sample a certain number of test cases from each cluster. If a sampled test case is fault-revealing, the entire cluster from which it was sampled is selected. This strategy favors small clusters and has a high probability of selecting fault-revealing test cases.
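The sampling strategy can be sketched as below. This is a simplified reading, not the authors' implementation: it assumes a fixed number of samples per cluster, an oracle `is_fault_revealing` that tells whether a test case fails, and that the sampled test cases of clusters without failures stay in the reduced suite. All names and parameters are hypothetical.

```python
import random


def adaptive_sampling(clusters, is_fault_revealing, sample_size=1, seed=0):
    """Sample each cluster; keep the whole cluster if a sampled test fails."""
    rng = random.Random(seed)
    selected = []
    for cluster in clusters:
        sample = rng.sample(cluster, min(sample_size, len(cluster)))
        if any(is_fault_revealing(test) for test in sample):
            # A failure was found: select every test case in this cluster.
            selected.extend(cluster)
        else:
            # Assumption: otherwise keep only the initial sample.
            selected.extend(sample)
    return selected


# Hypothetical clusters; the small cluster isolates the failing test cases.
clusters = [["a", "b"], ["c", "d", "e"]]
failing = {"a", "b"}

selected = adaptive_sampling(clusters, lambda test: test in failing)
print(selected)
```

Here the first cluster contains only failing test cases, so any sample from it triggers selection of the whole cluster, while the second cluster contributes a single sampled test case.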

Experiment & Evaluation

• Subject program

– space, from SIR (Software-artifact Infrastructure Repository)

– 5902 LOC

– 1533 basic blocks

– 38 modified versions (each version is augmented with a real fault)

– 13585 test cases

• Subject program

• Measurements

• Experimental results

• Observations

• 3 measurements

– Precision

– Reduction

– Recall

• Precision

– If, in a certain run, the technique selects a subset of N test cases, of which M are fault-revealing, the precision of the technique is M / N * 100%.

– Precision measures the extent to which a selection method omits non-fault-revealing test cases in a run.

• Reduction

– If a selection technique selects M test cases out of all N existing test cases in a certain run, the reduction of the technique is M / N * 100%.

– Reduction measures the extent to which a technique can reduce the size of the original test suite.

– A low reduction means that a selection technique greatly reduces the original test suite.

Experiment & Evaluation

• Recall

– If a selection technique selects M fault-revealing test cases out of N existing fault-revealing test cases in a certain run, the recall of the technique is M / N * 100%.

– Recall measures the extent to which a selection technique can include fault-revealing test cases.

– Recall indicates the fault-detecting capability of a technique. A safe selection technique achieves 100% recall.
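The three measurements can be computed directly from sets of test case identifiers. The sketch below uses made-up test names and assumes the selected set and the fault-revealing set are both non-empty. The function names are illustrative.

```python
def precision(selected, fault_revealing):
    """Percentage of selected test cases that are fault-revealing."""
    selected, fault_revealing = set(selected), set(fault_revealing)
    return 100.0 * len(selected & fault_revealing) / len(selected)


def reduction(selected, all_tests):
    """Percentage of the original suite that is selected (lower is better)."""
    return 100.0 * len(set(selected)) / len(set(all_tests))


def recall(selected, fault_revealing):
    """Percentage of fault-revealing test cases that are selected."""
    selected, fault_revealing = set(selected), set(fault_revealing)
    return 100.0 * len(selected & fault_revealing) / len(fault_revealing)


# Hypothetical suite of 10 test cases, 4 of which are fault-revealing.
all_tests = [f"t{i}" for i in range(1, 11)]
fault_revealing = {"t1", "t2", "t3", "t4"}
selected = {"t1", "t2", "t3", "t7", "t8"}

print(precision(selected, fault_revealing))  # 60.0
print(reduction(selected, all_tests))        # 50.0
print(recall(selected, fault_revealing))     # 75.0
```

In this made-up run the selection misses one fault-revealing test case, so recall is below 100% and the selection would not count as safe.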

Experiment & Evaluation

• Experimental results

– A comparison between our approach and Dejavu. Dejavu is known as an effective test selection algorithm with high precision.

– A comparison between two different filtering rates: FR = 0.3 and FR = 0.5

Comparison of precision between our approach (FR = 0.3) and Dejavu

Comparison of reduction between our approach (FR = 0.3) and Dejavu

Comparison of recall between our approach (FR = 0.3) and Dejavu

We achieve some improvement on all versions except 13, 25, 26, 35, 37 and 38.


• Analysis

– The key to our approach is to isolate the fault-revealing test cases into small clusters.

– Failures detected on versions 13, 25, 26, 35, 37 and 38 are mostly memory access violation failures. Those failures cause premature termination of the execution flow.

– Program slicing cannot predict runtime execution flow changes and therefore cannot provide enough information to differentiate these test cases and separate them into different clusters.

Comparison of precision between FR = 0.3 and FR = 0.5

Comparison of reduction between FR = 0.3 and FR = 0.5

Comparison of recall between FR = 0.3 and FR = 0.5

If we raise FR to 0.5, some improvement in precision, reduction and recall can be achieved.


• Observations

– For most versions, our approach has higher precision and lower reduction (lower is better) than Dejavu. This means that we can select fault-revealing test cases from the original test suite while selecting relatively few non-fault-revealing test cases.


• Observations

– The effectiveness of our approach depends largely on the level of isolation of fault-revealing test cases. By choosing appropriate parameters such as filtering rate, sampling rate, initial cluster number, etc., we can enhance the level of isolation.

Future work

• We will try to answer the following questions in our future work:

– How do distance metrics and clustering algorithms affect the result of cluster selection techniques?

– Given a program, how can we find the best filtering rate and other parameters?

Q & A
