[IEEE 2012 CSI Sixth International Conference on Software Engineering (CONSEG) - Indore, Madhay Pradesh, India (2012.09.5-2012.09.7)] 2012 CSI Sixth International Conference on Software

Software Reliability Growth Model with Testing Effort using Learning Function

Sunil Kumar Khatri

Amity Institute of Info. Technology

Amity University, Noida, India [email protected]

Deepak Kumar

Amity Institute of Info. Technology

Amity University, Noida, India [email protected]

Asit Dwivedi

Amity Institute of Biotech.

Amity University, Noida, India [email protected]

Nitika Mrinal

Tata Consultancy Services, 7th Floor, Plot No. 20 & 21, Sector 135, Noida, India

[email protected]

Abstract – Software Reliability Growth Models (SRGMs) have been proposed in the literature to measure the quality of software and to release the software at minimum cost. Testing is an important means of finding faults during the Software Development Life Cycle of integrated software. Testing can be defined as the execution of a program to find a fault which might have been introduced during the testing time under different assumptions. The testing team may not be able to remove a fault perfectly when a failure is detected: the original fault may remain or may be replaced by another fault. The former phenomenon is known as imperfect fault removal; the latter is called error generation. In this paper, we propose a new SRGM with two types of imperfect debugging and with testing effort, using a learning function that reflects the expertise gained by the testing team. This expertise depends on the complexity of the software, the skills of the debugging team, the available manpower and the software development environment. The parameters of the model are estimated, and the model is compared with other existing models on real data sets. The estimation results show the comparative performance and application of different SRGMs with testing effort.

Keywords - software reliability growth model (SRGM), testing effort, Non-Homogeneous Poisson Process (NHPP), testing phase, fault, imperfect debugging, error generation.

I. INTRODUCTION

The NASA Software Assurance Standard, NASA-STD-8739.8, defines software reliability as a discipline of software assurance that measures and analyzes defects and derives the reliability and maintainability factors. Software Reliability Growth Models (SRGMs) measure and predict the improvement of reliability through the testing process. A reliability growth model also represents the reliability or failure rate of a system as a function of time or of the number of test cases. Goel and Okumoto [3] proposed a Non-Homogeneous Poisson Process (NHPP) based SRGM. In many software development projects it is observed that the relation between the mean number of errors removed and the testing time is S-shaped. Yamada et al. [21] described fault removal as a two-stage process: failure observation and the corresponding removal of the cause of the failure. Other S-shaped models have been discussed by Ohba [16], Kapur et al. [8, 9], Khatri et al. [10] and Kumar [11].

The concept of imperfect debugging has been studied by many authors, but most of them considered only error generation as imperfect debugging. Kapur & Garg [7] proposed an SRGM incorporating the effect of imperfect fault debugging. They assumed that the cause of a failure might not be properly fixed during a removal attempt, as discussed earlier, resulting in a number of failures greater than the number of removal attempts made. Since the number of failures is greater than the number of removals, the two phenomena are studied separately. Ohba & Chou [17] introduced the effect of error generation into reliability modeling. Later, based on their model, other researchers in the field of reliability modeling further studied the effect of error generation [9, 10, 11]. All these models have been named Imperfect Debugging models. This is a common misconception in the study of reliability engineering: most of the existing research does not clearly distinguish between the two types of imperfect debugging, which leads to confusion about the topic.

In this paper, Section II describes the notation and assumptions for the development of a software reliability growth model with imperfect debugging and fault generation using a learning function. In Section III, we discuss several testing effort functions. In Section IV, we review software reliability growth models with imperfect debugging developed under different sets of assumptions by different researchers. Section V presents the proposed model with two types of imperfect debugging and testing effort using a learning function. In Section VI, we describe the comparison criteria, the real data sets, the tables, the parameter results and the goodness of fit curves. Section VII concludes the paper.

II. NOTATIONS AND ASSUMPTIONS

A. Notations

$m(t)$: Mean number of faults detected in the time interval $(0, t]$

$m_r(t)$: Mean number of faults removed in the time interval $(0, t]$

$a$: Initial number of faults lying dormant in the software (constant)

$b(t)$: Fault detection rate per remaining fault as a function of testing time

$b$: Fault removal rate per remaining fault

$\alpha, \beta$: Constants in the learning function

$w(t)$: Testing effort at time $t$

$W(t)$: Cumulative testing effort in the time interval $(0, t)$, with

$$\frac{d}{dt}W(t) = w(t)$$

B. Assumptions

The model is based on the following assumptions.

1. Software is subject to failures at random times caused by errors remaining in the software.

2. The fault removal process follows an NHPP.

3. The mean number of faults detected in the time interval $(t, t + \Delta t)$ by the current testing effort is proportional to the mean number of remaining faults in the system, and the proportionality is a constant.

4. Each time a failure is observed, an immediate effort takes place to decide the cause of the failure and to remove it.

III. MODELING TESTING EFFORT

The testing effort (resources) that governs the pace of testing for almost all software development mainly consists of:

(a) Manpower, which includes (i) failure identification personnel and (ii) failure correction personnel

(b) Computer time

To model the cumulative testing effort, various distributions (Exponential, Rayleigh, Weibull and Logistic functions) have been discussed. The first two can be derived from the assumption that "the testing effort rate is proportional to the testing resources available":

$$\frac{dW(t)}{dt} = v(t)\,[N - W(t)] \tag{3.1}$$

where $v(t)$ is the time-dependent rate at which testing resources are consumed, with respect to the remaining available resources. Solving equation (3.1) under the initial condition $W(0) = 0$, we get the following cases.

Case 1: When $v(t) = v$, a constant, we get the exponential function:

$$W(t) = N\left(1 - e^{-vt}\right) \tag{3.2}$$

Case 2: If $v(t) = v \cdot t$, we get the Rayleigh type curve:

$$W(t) = N\left(1 - e^{-vt^{2}/2}\right) \tag{3.3}$$

Case 3: Yamada et al. [23] described the time-dependent behavior of testing effort expenditure by a Weibull curve while proposing an SRGM. If $v(t) = v \cdot l \cdot t^{\,l-1}$, we get the Weibull function:

$$W(t) = N\left(1 - e^{-vt^{l}}\right) \tag{3.4}$$

Exponential and Rayleigh curves become special cases of the Weibull curve for $l = 1$ and $l = 2$ respectively. To study the testing effort process, one of the above functions can be selected.

Case 4: Huang et al. [4] developed an SRGM based upon NHPP with a logistic testing effort function. If we define

$$\frac{dW(t)}{dt} = v\,\frac{W(t)}{N}\,[N - W(t)] \tag{3.5}$$

On solving, the cumulative testing effort consumed in the interval $(0, t)$ is given by

$$W(t) = \frac{N}{1 + l\,e^{-vt}} \tag{3.6}$$

where $W(0) = \dfrac{N}{1 + l}$, and $N$, $v$ and $l$ are constants. This is the Logistic testing effort function.
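The four cumulative testing effort functions (3.2), (3.3), (3.4) and (3.6) can be sketched in code as follows. This is a minimal illustration only: the function names and the numeric values are assumptions chosen for the example and do not come from the paper.

```python
import math


def exponential_effort(t, N, v):
    """Equation (3.2): W(t) = N (1 - exp(-v t))."""
    return N * (1.0 - math.exp(-v * t))


def rayleigh_effort(t, N, v):
    """Equation (3.3): W(t) = N (1 - exp(-v t^2 / 2))."""
    return N * (1.0 - math.exp(-v * t * t / 2.0))


def weibull_effort(t, N, v, l):
    """Equation (3.4): W(t) = N (1 - exp(-v t^l))."""
    return N * (1.0 - math.exp(-v * t ** l))


def logistic_effort(t, N, v, l):
    """Equation (3.6): W(t) = N / (1 + l exp(-v t))."""
    return N / (1.0 + l * math.exp(-v * t))


# Illustrative (hypothetical) parameter values, not estimates from the paper.
N, v, l = 100.0, 0.1, 9.0

# The first three curves start at W(0) = 0; the logistic curve starts at N / (1 + l).
start_values = (
    exponential_effort(0.0, N, v),
    rayleigh_effort(0.0, N, v),
    weibull_effort(0.0, N, v, 2.0),
    logistic_effort(0.0, N, v, l),
)

# Weibull with l = 1 coincides with the exponential curve.
gap_exponential = weibull_effort(3.0, N, v, 1.0) - exponential_effort(3.0, N, v)

# Weibull with l = 2 coincides with the Rayleigh curve once the rate is halved,
# because (3.3) carries the factor 1/2 in the exponent and (3.4) does not.
gap_rayleigh = weibull_effort(3.0, N, v / 2.0, 2.0) - rayleigh_effort(3.0, N, v)
```

All four functions approach N as t grows, so N plays the role of the total amount of testing effort eventually consumed.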

IV. LITERATURE REVIEW

In this section, we review SRGMs based upon the Non-Homogeneous Poisson Process (NHPP) under different sets of assumptions.

M1. Two stage delayed S-shaped with testing effort model

Besides assumptions 1-4, it is assumed that faults are perfectly removed. This model is due to Kapur et al. [6]. It is a modification of the NHPP to obtain an S-shaped curve for the cumulative number of faults detected, such that the failure rate initially increases and later decays. The software fault detection process described by such an S-shaped curve can be regarded as a learning process, because the testers' skills gradually improve as time progresses. The following differential equation describes the removal phenomenon under the testing effort function of the model:

$$\frac{dm_r(t)}{dt} \times \frac{1}{w(t)} = b(t)\,\big(a - m_r(t)\big) \tag{4.1}$$

where $b(t) = \dfrac{b^{2}\,W(t)}{1 + bW(t)}$.

Solving the differential equation (4.1) under the initial conditions $m_r(0) = 0$ and $W(0) = 0$, we get the mean value function as

$$m_r(t) = a\left[1 - \big(1 + bW(t)\big)\,e^{-bW(t)}\right] \tag{4.2}$$
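Because $dW = w(t)\,dt$, equation (4.1) can be read as $dm_r/dW = b(W)\,(a - m_r)$, an equation in the cumulative effort alone. The sketch below integrates that form numerically with a classical fourth-order Runge-Kutta scheme and compares the result with the closed form (4.2). The helper names and the values of a and b are illustrative assumptions, not estimates from the paper.

```python
import math


def m1_mean_value(W, a, b):
    """Closed form (4.2), written as a function of cumulative effort W."""
    return a * (1.0 - (1.0 + b * W) * math.exp(-b * W))


def m1_rhs(W, m, a, b):
    """Right-hand side of (4.1) in the form dm_r/dW = b(W) (a - m_r)."""
    return (b * b * W / (1.0 + b * W)) * (a - m)


def integrate_rk4(rhs, W_end, steps, a, b):
    """Integrate from W = 0, m_r = 0 up to W_end with classical RK4."""
    h = W_end / steps
    W, m = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(W, m, a, b)
        k2 = rhs(W + h / 2.0, m + h * k1 / 2.0, a, b)
        k3 = rhs(W + h / 2.0, m + h * k2 / 2.0, a, b)
        k4 = rhs(W + h, m + h * k3, a, b)
        m += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        W += h
    return m


# Hypothetical values chosen only to exercise the formulas.
a, b = 1000.0, 0.01
numeric = integrate_rk4(m1_rhs, 500.0, 2000, a, b)
closed = m1_mean_value(500.0, a, b)
difference = abs(numeric - closed)
```

Working in W rather than t means the check holds for any of the testing effort functions of Section III, since the choice of W(t) only changes how quickly the effort axis is traversed.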

M2. Delayed S-shaped with testing effort under imperfect debugging model

This model is due to Huang et al. [4]. It includes the time delay between the detection and correction processes during fault removal, with testing effort, under imperfect debugging. The following differential equation describes the removal phenomenon under the testing effort function of the model:

$$\frac{dm_r(t)}{dt} \times \frac{1}{w(t)} = b(t)\,\big(a(t) - m_r(t)\big) \tag{4.3}$$

where $b(t) = \dfrac{b^{2}\,W(t)}{1 + bW(t)}$ and $a(t) = a + \gamma\, m_r(t)$.

Solving the differential equation (4.3) under the initial conditions $m_r(0) = 0$ and $W(0) = 0$, we get the mean value function as

$$m_r(t) = \frac{a}{1-\gamma}\left[1 - \big(1 + bW(t)\big)\,e^{-b(1-\gamma)W(t)}\right] \tag{4.4}$$
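As a small illustration, the mean value function (4.4) can be coded exactly as printed and compared with (4.2). The function names are assumptions made for this sketch; the parameter values are the M2 estimates reported for DS-1 in Table 2, used here only as plausible inputs.

```python
import math


def m1_mean_value(W, a, b):
    """Equation (4.2)."""
    return a * (1.0 - (1.0 + b * W) * math.exp(-b * W))


def m2_mean_value(W, a, b, gamma):
    """Equation (4.4) as printed, with fault generation rate gamma."""
    decay = math.exp(-b * (1.0 - gamma) * W)
    return a / (1.0 - gamma) * (1.0 - (1.0 + b * W) * decay)


# M2 estimates for DS-1 as listed in Table 2.
a, b, gamma = 1235.20, 0.0034, 0.05

# With gamma = 0 (no fault generation) the expression collapses to M1.
gap_to_m1 = m2_mean_value(400.0, a, b, 0.0) - m1_mean_value(400.0, a, b)

# For very large effort the curve levels off at a / (1 - gamma),
# which exceeds the initial fault content a because faults are generated.
long_run = m2_mean_value(1.0e5, a, b, gamma)
```

The long-run level a / (1 - gamma) shows how a constant fault generation rate raises the total number of faults eventually removed above the initial content a.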

A SUMMARY OF NHPP MODELS WITH TESTING EFFORTS

| Model Name | Mean Value Function $m_r(t)$ | $a(t)$ and $b(t)$ | Comments |
|---|---|---|---|
| M1 | $m_r(t) = a\left[1 - \dfrac{1 + bW(t)}{\exp(bW(t))}\right]$ | $a(t) = a$ and $b(t) = \dfrac{b^{2}W(t)}{1 + bW(t)}$ | Two stage phenomenon can be derived in single stage using non decreasing fault removal rate per additional fault |
| M2 | $m_r(t) = \dfrac{a}{1-\gamma}\left[1 - \dfrac{1 + bW(t)}{\exp(b(1-\gamma)W(t))}\right]$ | $a(t) = a + \gamma\, m_r(t)$ and $b(t) = \dfrac{b^{2}W(t)}{1 + bW(t)}$ | Two stage phenomenon with constant fault generation rate can be derived in single stage using non decreasing fault removal rate per additional fault |

V. MODELING SOFTWARE RELIABILITY GROWTH MODEL

The developer may gain expertise depending on the complexity of the software, the skill of the debugging team, the available manpower and the software development environment. This concept can be embedded in the modeling of an SRGM using a learning function. Besides assumptions 1-4, we assume that the fault removal rate per additional fault removed is reduced by the probability of perfect debugging, $p$, and that a constant proportion of faults, $\gamma$, is generated. The differential equation describing the removal phenomenon incorporating imperfect debugging and fault generation is given by

$$\frac{dm_r(t)}{dt} \times \frac{1}{w(t)} = p\,b(t)\,\big(a(t) - m_r(t)\big) \tag{5.1}$$

where $b(t) = \dfrac{\alpha + \beta W(t)}{1 + bW(t)}$ and $a(t) = a + \gamma\, m_r(t)$.

The above differential equation can be rewritten as

$$\frac{dm_r(t)}{dt} \times \frac{1}{w(t)} = p\,\frac{\alpha + \beta W(t)}{1 + bW(t)}\,\big(a + \gamma\, m_r(t) - m_r(t)\big) \tag{5.2}$$

Solving the above differential equation (5.2) under the initial condition $m_r(0) = 0$, we get the mean value function as

$$m_r(t) = \frac{a}{1-\gamma}\left[1 - \big(1 + bW(t)\big)^{\frac{p(1-\gamma)(\beta - b\alpha)}{b^{2}}}\,e^{-p(1-\gamma)\frac{\beta}{b}W(t)}\right] \tag{5.3}$$
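The integration step behind (5.3) can be sketched as follows. The sketch reads the learning function in terms of the cumulative effort $W(t)$, as the solution (5.3) does, uses $dW = w(t)\,dt$ and $W(0) = 0$, and assumes $b > 0$ and $\gamma < 1$. The auxiliary variable $y$ is introduced here only for the derivation and is not part of the paper's notation.

```latex
\begin{aligned}
y &:= a - (1-\gamma)\,m_r, \qquad y\big|_{W=0} = a, \\
\frac{dy}{dW} &= -\,p\,(1-\gamma)\,\frac{\alpha + \beta W}{1 + bW}\;y, \\
\frac{\alpha + \beta W}{1 + bW} &= \frac{\beta}{b} \;-\; \frac{\beta - b\alpha}{b}\cdot\frac{1}{1 + bW}, \\
\int_{0}^{W} \frac{\alpha + \beta u}{1 + bu}\,du &= \frac{\beta}{b}\,W \;-\; \frac{\beta - b\alpha}{b^{2}}\,\ln(1 + bW), \\
y &= a\,(1 + bW)^{\frac{p(1-\gamma)(\beta - b\alpha)}{b^{2}}}\; e^{-p(1-\gamma)\frac{\beta}{b}W}, \\
m_r &= \frac{a - y}{1-\gamma}.
\end{aligned}
```

Substituting the expression for $y$ into the last line gives a mean value function of the form stated in (5.3).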

It reduces to the M1 model if $p = 1$, $\gamma = 0$, $\beta = b^{2}$ and $\alpha = 0$:

$$m_r(t) = a\left[1 - \big(1 + bW(t)\big)\,e^{-bW(t)}\right],$$

which is the same as equation (4.2). It gives the M2 model if $p = 1$, $\beta = b^{2}$ and $\alpha = 0$ are substituted in $m_r(t)$:

$$m_r(t) = \frac{a}{1-\gamma}\left[1 - \big(1 + bW(t)\big)\exp\big(-b(1-\gamma)W(t)\big)\right],$$

which is the same as equation (4.4).
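A minimal sketch of the proposed mean value function (5.3) is given below, together with two checks: the reduction to M1 for $p = 1$, $\gamma = 0$, $\beta = b^{2}$, $\alpha = 0$, and a finite-difference check that the closed form satisfies the differential equation (5.2) when the learning function is read in terms of cumulative effort $W$. Function names are assumptions made for this illustration; the parameter values are the proposed-model estimates for DS-1 from Table 2.

```python
import math


def proposed_mean_value(W, a, b, alpha, beta, p, gamma):
    """Equation (5.3) as a function of cumulative testing effort W."""
    k = p * (1.0 - gamma)
    power = k * (beta - b * alpha) / (b * b)
    decay = math.exp(-k * (beta / b) * W)
    return a / (1.0 - gamma) * (1.0 - (1.0 + b * W) ** power * decay)


def proposed_rhs(W, m, a, b, alpha, beta, p, gamma):
    """Right-hand side of (5.2) in the form dm_r/dW, with b(t) written in W."""
    return p * (alpha + beta * W) / (1.0 + b * W) * (a + gamma * m - m)


def m1_mean_value(W, a, b):
    """Equation (4.2)."""
    return a * (1.0 - (1.0 + b * W) * math.exp(-b * W))


# Reduction to M1: p = 1, gamma = 0, beta = b^2, alpha = 0.
b_m1 = 0.0033
gap_to_m1 = (proposed_mean_value(400.0, 1298.63, b_m1, 0.0, b_m1 * b_m1, 1.0, 0.0)
             - m1_mean_value(400.0, 1298.63, b_m1))

# Proposed-model estimates for DS-1 as listed in Table 2.
params = dict(a=1495.01, b=0.0077, alpha=0.00086, beta=0.0000104, p=0.98, gamma=0.001)

# Central finite difference of the closed form versus the right-hand side of (5.2).
W0, h = 500.0, 1.0e-3
slope = (proposed_mean_value(W0 + h, **params)
         - proposed_mean_value(W0 - h, **params)) / (2.0 * h)
residual = slope - proposed_rhs(W0, proposed_mean_value(W0, **params), **params)
```

Keeping the exponent of $(1 + bW)$ and the exponential decay as separate named quantities makes it easy to see which term each of the parameters $\alpha$, $\beta$, $p$ and $\gamma$ acts on.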

VI. PARAMETER ESTIMATION AND COMPARISON CRITERIA

The proposed model is non-linear, which makes parameter estimation difficult. Finding the Least Squares solution for a non-linear model has no closed form in general and requires numerical algorithms. SPSS (Statistical Package for Social Sciences) is used to solve this problem.

A. Comparison Criteria: Goodness of Fit Criteria

In general, a model can be analyzed according to its ability to reproduce the observed behavior of the software. Some of the comparison criteria are:

• The Mean Square Fitting Error (MSE) [8]

• Coefficient of Multiple Determination ($R^2$) [8]

• Variation [19]

• Root Mean Square Prediction Error (RMSPE) [19]
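The criteria listed above can be computed as in the sketch below. The paper does not spell out the formulas, so the definitions used here are assumptions based on their common usage: SSE is the sum of squared fitting errors, MSE is SSE divided by the number of observations, $R^2 = 1 - \mathrm{SSE}/\mathrm{SST}$, Bias is the mean prediction error, Variation is the sample standard deviation of the prediction errors, and RMSPE is $\sqrt{\mathrm{Bias}^2 + \mathrm{Variation}^2}$. The function name and the toy numbers are hypothetical.

```python
import math


def goodness_of_fit(observed, fitted):
    """Return SSE, MSE, R2, Bias, Variation and RMSPE under the assumed definitions."""
    k = len(observed)
    errors = [f - o for f, o in zip(fitted, observed)]

    sse = sum(e * e for e in errors)            # sum of squared errors
    mse = sse / k                               # mean square fitting error

    mean_observed = sum(observed) / k
    sst = sum((o - mean_observed) ** 2 for o in observed)
    r2 = 1.0 - sse / sst                        # coefficient of multiple determination

    bias = sum(errors) / k                      # mean prediction error
    variation = math.sqrt(sum((e - bias) ** 2 for e in errors) / (k - 1))
    rmspe = math.sqrt(bias ** 2 + variation ** 2)

    return {"SSE": sse, "MSE": mse, "R2": r2,
            "Bias": bias, "Variation": variation, "RMSPE": rmspe}


# Toy numbers for illustration only; they are not taken from DS-1 or DS-2.
observed = [10.0, 20.0, 30.0, 40.0]
fitted = [12.0, 19.0, 33.0, 38.0]
criteria = goodness_of_fit(observed, fitted)
```

Lower MSE, Variation and RMSPE and a higher $R^2$ indicate a better fit, which is the sense in which the models are compared in the tables that follow.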

B. Data Analysis and Model Comparison

To check the validity of the proposed model given by equation (5.3), it has been tested on two real data sets and the results are compared with existing models.

Data Set-1 (DS-1)

The first data set (DS-1) was collected during 35 months of testing a radar system of size 124 KLOC; 1301 faults were detected during testing and 1846.92 CPU hours were consumed. This data is cited from Brooks and Motley [1].

Estimation results for the effort function and the proposed SRGM are given in Table 1 and Table 2 respectively. The Weibull curve describes the effort data best, so parameter estimation of the SRGMs is performed with respect to the Weibull effort function. Statistical parameters of the proposed model are estimated and compared with those of the other models. The comparison criteria are shown in Table 3; it can be seen that the proposed model provides a better fit to this data set. The fitting of the models is illustrated graphically in Figure 1.
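The paper performs the non-linear least squares estimation in SPSS. As a rough illustration of what such an estimation involves, the sketch below fits the simpler M1 model (4.2) with a different, hand-rolled technique: for a fixed $b$ the best $a$ follows from linear least squares, and $b$ is then found by a golden-section search on the resulting sum of squared errors. This is not the authors' procedure; the data are synthetic and noise-free, generated from (4.2) with hypothetical parameter values, and the search assumes the error curve has a single minimum inside the chosen bracket.

```python
import math


def m1_shape(W, b):
    """g(W; b) = 1 - (1 + bW) exp(-bW), so that (4.2) reads m_r = a * g."""
    return 1.0 - (1.0 + b * W) * math.exp(-b * W)


def profile_sse(b, efforts, faults):
    """For a fixed b, return the SSE and the least-squares optimal a."""
    g = [m1_shape(W, b) for W in efforts]
    a = sum(y * gi for y, gi in zip(faults, g)) / sum(gi * gi for gi in g)
    sse = sum((y - a * gi) ** 2 for y, gi in zip(faults, g))
    return sse, a


def fit_m1(efforts, faults, b_low, b_high, iterations=100):
    """Golden-section search over b; assumes one minimum in [b_low, b_high]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = b_low, b_high
    for _ in range(iterations):
        x1 = hi - ratio * (hi - lo)
        x2 = lo + ratio * (hi - lo)
        if profile_sse(x1, efforts, faults)[0] < profile_sse(x2, efforts, faults)[0]:
            hi = x2   # minimum lies in [lo, x2]
        else:
            lo = x1   # minimum lies in [x1, hi]
    b = (lo + hi) / 2.0
    sse, a = profile_sse(b, efforts, faults)
    return a, b, sse


# Synthetic, noise-free observations generated from (4.2); not real project data.
true_a, true_b = 1300.0, 0.003
efforts = [50.0 * i for i in range(1, 36)]          # cumulative effort per period
faults = [true_a * m1_shape(W, true_b) for W in efforts]

a_hat, b_hat, sse = fit_m1(efforts, faults, 1.0e-4, 2.0e-2)
```

Eliminating $a$ analytically reduces the search to one dimension; for the proposed model, with more non-linear parameters, a general-purpose optimizer would be the more practical choice.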

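TABLE 1. TESTING EFFORT FUNCTIONS FOR DS-1: ESTIMATES OF PARAMETERS AND GOODNESS OF FIT CRITERIA

| Effort function | N | v | l | $R^2$ |
|---|---|---|---|---|
| Exponential | 1546.171 | 0.0011 | - | 0.9947 |
| Rayleigh | 2873.41 | 0.0017 | - | 0.99825 |
| Weibull | 2669.91 | 0.0008 | 2.0686 | 0.99831 |
| Logistic | 2066.87 | 0.1616 | 38.6436 | 0.99425 |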

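TABLE 2. PARAMETER ESTIMATION OF DS-1

| Parameters | M1 | M2 | Proposed Model |
|---|---|---|---|
| $a$ | 1298.63 | 1235.20 | 1495.01 |
| $b$ | 0.0033 | 0.0034 | 0.0077 |
| $\alpha$ | - | - | 0.00086 |
| $\beta$ | - | - | 0.0000104 |
| $p$ | - | - | 0.98 |
| $\gamma$ | - | 0.05 | 0.001 |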

TABLE 3. COMPARISON CRITERIA FOR DS-1

| Models | $R^2$ | SSE | MSE | Variation | RMSPE |
|---|---|---|---|---|---|
| M1 | 0.987 | 95089.4 | 2716.84 | 48.048 | 52.75 |
| M2 | 0.987 | 93584.95 | 2673.856 | 47.684 | 52.33 |
| Proposed Model | 0.996 | 31775.06 | 907.859 | 30.306 | 30.56 |

FIGURE 1. Goodness of fit curves for DS-1 (x-axis: Time, 1 to 35; y-axis: Cumulative number of faults, 0 to 1600; series: Actual Data, M1, M2, Proposed Model)

Data Set-2 (DS-2)

The data set was obtained from the paper by Ohba [16] for a PL/I database application software system consisting of approximately 1,317,000 LOC. It was tested for 19 weeks, during which 47.65 CPU hours were consumed and 328 faults were removed. The estimated values of the parameters for this data set are given in Table 4 and Table 5. The comparison with the existing models is made in Table 6. The results of the proposed model are found to be better than those of the existing models in terms of $R^2$, MSE, SSE, Variation and RMSPE. The fitting of the models is illustrated graphically in Figure 2, which shows that the proposed model fits better.

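TABLE 4. TESTING EFFORT FUNCTIONS FOR DS-2: ESTIMATES OF PARAMETERS AND GOODNESS OF FIT CRITERIA

| Effort function | N | v | l | $R^2$ |
|---|---|---|---|---|
| Exponential | 3770.89 | 0.0007 | - | 0.9917 |
| Rayleigh | 49.32 | 0.0137 | - | 0.9743 |
| Weibull | 799.22 | 0.0023 | 1.1146 | 0.9957 |
| Logistic | 54.84 | 0.2263 | 13.0334 | 0.9919 |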

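TABLE 5. PARAMETER ESTIMATION OF DS-2

| Parameters | M1 | M2 | Proposed Model |
|---|---|---|---|
| $a$ | 354.78 | 351.41 | 332 |
| $b$ | 0.0889 | 0.0888 | 0.008 |
| $\alpha$ | - | - | 0.01989 |
| $\beta$ | - | - | 0.00263 |
| $p$ | - | - | 0.99173 |
| $\gamma$ | - | 0.01 | 0.001 |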

TABLE 6. COMPARISON CRITERIA FOR DS-2

| Models | $R^2$ | SSE | MSE | Variation | RMSPE |
|---|---|---|---|---|---|
| M1 | 0.978 | 4293.23 | 225.96 | 14.87 | 15.41 |
| M2 | 0.9781 | 4284.09 | 225.48 | 14.86 | 15.398 |
| Proposed Model | 0.986 | 2831.54 | 149.03 | 12.17 | 12.523 |

FIGURE 2. Goodness of fit curves for DS-2 (x-axis: Time, 1 to 19; y-axis: Cumulative number of faults, 0 to 350; series: Actual Data, M1, M2, Proposed Model)

VII. CONCLUSIONS

The concept of learning has been incorporated in the fault detection rate to show the effect of the learning of the testing team as testing progresses. In this paper a new SRGM with imperfect debugging and fault generation using a learning function has been developed. The model has been validated, evaluated and compared with other existing NHPP models by applying it to two data sets. The results show that the proposed model provides improved goodness of fit for software failure occurrence / fault removal data, owing to its applicability and flexibility. The concept of change point can be incorporated in the proposed model. In this paper we have taken a constant probability of perfect debugging as well as a constant error generation rate; however, these may vary with time. This provides new research directions towards more realistic SRGMs.

REFERENCES

[1] Brooks, WD and Motley, RW “Analysis of discrete software reliability models-Technical Report (RADC-TR-80-84)”, Rome Air Development Center, New York, 1980.

[2] Goel, A.L., Software Reliability Models: Assumptions, Limitations and Applicability, IEEE Transactions on Software Engineering; SE-11, pp. 1411-1423, 1985.

[3] Goel A.L., Okumoto K., “Time-dependent error-detection rate model for software reliability and other performance measures”, IEEE Transaction Reliability, vol. 28, pp. 206–211, 1979.

[4] Huang, C-Y, S-Y. Kuo and J.Y. Chen, “Analysis of a software reliability growth model with logistic testing effort function”, in Proceeding of 8th International Symposium on software reliability engineering, 378-388, 1997.

[5] Huang et al. Chin-Yu “Software Reliability Analysis by Considering Fault Dependency and Debugging Time Lag” IEEE Transaction on Reliability, Vol.35, No.3 pp.436-449, 2006.

[6] Kapur P.K., Garg R.B. “Optimal release policies for software systems with testing effort”, International Journal of Systems Science, Vol. 22, No. 9. pp. 1563-1571, 1991.

[7] Kapur P.K., Garg R.B. “A software reliability growth model for an error removal phenomenon”, Software Engineering Journal, Vol. 7, pp. 291-294, 1992.

[8] Kapur P.K., Garg R.B. and Kumar S., “Contributions to hardware and software reliability”, World Scientific, Singapore,1999.

[9] Kapur P. K., Kumar D., Gupta A., Jha P. C., On How To Model software Reliability Growth in the Presence Of Imperfect Debugging and Fault Generation, Proceedings of 2nd International Conference on Reliability and safety Engineering, pp. 515-523, 2006.

[10] Khatri SK, Kapur PK and Johri P; “Flexible Discrete Software Reliability Growth Model for Distributed Environment Incorporating two types of Imperfect Debugging”, IEEE Proceedings of 2012 Second International Conference on Advanced Computing & Communication Technologies, doi 10.1109/ACCT.2012.54, pp. 57-63, 2012.

[11] Kumar Deepak, “ Software Reliability Engineering: A Brief Description” Lambert Academic Publishing, Germany, 2010.

[12] Lyu M.R., “Handbook of Software Reliability Engineering”, McGraw-Hill: New York, 1996.

[13] Lo.Jung-Hua, Huang Chin-Yu, “An integration of fault detection and correction processes in software reliability analysis”, The Journal of Systems and Software, vol. 79, pp.1312-1323, 2006.

[14] Mishra PN “ Software reliability analysis”, IBM Systems Journal, 22:262- 270, 1983.

[15] Musa J.D., Iannino A., Okumoto K., “Software Reliability: Measurement, Prediction, Application”, McGraw-Hill, New York, 1987.

[16] Ohba M. “Inflection S-shaped Software Reliability Growth Model”, Lecture notes in Economics and Mathematical Systems Ed. S. Osaki and Y.Hotoyama, Springer Verlag,144-,1984.

[17] Ohba M. and Chou, X.M., “Does Imperfect Debugging Effect Software Reliability Growth”, proceedings of 11th International Conference of Software Engineering, pp 237-244, 1989.

[18] Pham H, “System software reliability”, Springer series in Reliability Engineering, 2006.

[19] Pillai K. and Nair, VSS., “A model for software development effort and cost estimations”, IEEE Transactions on software engineering, vol. 23(8), pp. 485-497, 1997.

[20] Schneidewind, N.F., “Analysis of error processes in computer software”, Sigplan Notices, vol. 10, pp.337–346, 1975.

[21] Yamada, S., Ohba, M., Osaki, S., “S-shaped reliability growth modeling for software error detection”, IEEE Transaction on Reliability, vol. 32, pp.475–478, 1983.

[22] Yamada S., Tokuno K. and Osaki S., “Imperfect debugging models with fault introduction rate for software reliability assessment”, Int. J. Syst. Sci., 23 (12), 1992.

[23] Yamada S, Ohtera H, Narihisa H “Software Reliability Growth Models with Testing Effort ” IEEE Transaction on Reliability, Vol. 35, pp. 19-23, 1986.
