The Scientific Method in Software Evaluation

Speaker: Benjamin Heitmann <[email protected]>

Applying the scientific method in Software Evaluation


DESCRIPTION

Is computer science a real science? If so, how does the scientific method apply to it? What are the benefits of running experiments as a computer scientist? And how can we apply the scientific method to evaluating the design and implementation of software?


Page 1: Applying the scientific method in Software Evaluation

The Scientific Method in Software Evaluation

Speaker: Benjamin Heitmann <[email protected]>


Page 2: Applying the scientific method in Software Evaluation

Benjamin Heitmann: The Scientific Method in Software Evaluation

Motivation

• The scientific method guarantees a certain degree of objectivity in science

• Following it improves the quality of our work

• Our work takes the form of software

• Evaluation is part of the scientific method

• How can software be evaluated?


Page 3: Applying the scientific method in Software Evaluation

Overview

Why:

• Is Computer Science a Science?

• The Scientific Method

• Reasons to (not) Experiment

How:

• Evaluating the Design

• Evaluating the Implementation

Page 4: Applying the scientific method in Software Evaluation

Is Computer Science a Science?

Reference: Denning, Is Computer Science Science?

Page 5: Applying the scientific method in Software Evaluation


Denning’s Definition

Chapter 1 of 5

Computer Science:

the science of information processes and their interactions with the world

• studies artificial and natural information processes

• blend of Science, Engineering, Mathematics and Art


Page 6: Applying the scientific method in Software Evaluation


Other Definitions

• the study of phenomena related to computers [Newell, Perlis and Simon, 1967]

• the study of information structures [Wegner, 1968]

• the study and management of complexity [Dijkstra, 1969]


Page 7: Applying the scientific method in Software Evaluation

Real Science

Francis Bacon's scientific method:

• The process of forming hypotheses and verifying them through experimentation

• Successful hypotheses become models that explain and predict the world

Page 8: Applying the scientific method in Software Evaluation

Example

• The study of algorithms, e.g. sorting algorithms

• Hypothesis: an algorithm executes with certain time and storage requirements

• Verification: by experimentation with real-world data
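The hypothesis–verification pair above can be made concrete with a small timing experiment. The sketch below is illustrative (not from the talk): it times Python's built-in sort on random data and compares two input sizes, the kind of measurement one would use to check an O(n log n) running-time hypothesis.

```python
import random
import time

def measure_sort(n, trials=5):
    """Time Python's built-in sort on n random floats (best of several trials)."""
    timings = []
    for _ in range(trials):
        data = [random.random() for _ in range(n)]
        start = time.perf_counter()
        sorted(data)
        timings.append(time.perf_counter() - start)
    return min(timings)  # best-of-n reduces scheduling noise

# Hypothesis: sorting is O(n log n), so doubling n should slightly more
# than double the runtime. Compare empirically:
t1 = measure_sort(50_000)
t2 = measure_sort(100_000)
print(f"50k: {t1:.4f}s, 100k: {t2:.4f}s, ratio: {t2 / t1:.2f}")
```

A single run is of course not an experiment; repeating this across machines, data distributions, and sizes is what turns the measurement into evidence for or against the hypothesis.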

Page 9: Applying the scientific method in Software Evaluation

Non-obvious problems solved by computing principles

Computation:

• Non-computability of some important problems

• Optimal algorithms for some common problems

Communication:

• Lossy but high-fidelity audio compression

• Secure cryptographic key exchange in open networks

Design:

• Objects and information hiding

Page 10: Applying the scientific method in Software Evaluation

Interdisciplinary Research

Constantly forming new relationships with other research fields:

• Bioinformatics

• Cognitive science

• Computational linguistics

• Immersive computing

• Quantum computing

Page 11: Applying the scientific method in Software Evaluation


True Science needs Validation

Tichy: 50% of CS papers proposing models and hypotheses are not validated (papers published before 1995). In other fields, only 15% are not validated.

Page 12: Applying the scientific method in Software Evaluation

The Scientific Method

Reference: Dodig-Crnkovic, Scientific Methods in Computer Science

Page 13: Applying the scientific method in Software Evaluation

Diagram of the Scientific Method

Chapter 2 of 5

The method is recursive and cyclic:

1. Existing theories and observations

2. Hypothesis

3. Predictions

4. Tests and new observations

5. Confirm the old theory or propose a new theory

6. Select among competing theories

The hypothesis is adjusted until consistency is achieved.
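The cycle above can be caricatured as a control loop. Everything below is a toy illustration (the callables stand in for real scientific work), showing how steps 2–4 repeat until prediction and observation are consistent:

```python
def scientific_method(observe, hypothesize, predict, test, tolerance):
    """Schematic loop mirroring steps 1-6 above; all callables are placeholders."""
    observations = observe()                 # 1. existing theories and observations
    hypothesis = hypothesize(observations)   # 2. form a hypothesis
    while True:
        prediction = predict(hypothesis)     # 3. derive predictions
        new_obs = test(prediction)           # 4. tests and new observations
        if abs(prediction - new_obs) <= tolerance:
            return hypothesis                # 5./6. consistency achieved: keep theory
        hypothesis = hypothesize(new_obs)    # otherwise adjust the hypothesis

# Toy usage: the "theory" is just a number refined toward a hidden true value.
TRUE_VALUE = 42.0
result = scientific_method(
    observe=lambda: 10.0,
    hypothesize=lambda obs: obs,
    predict=lambda h: h,
    test=lambda p: (p + TRUE_VALUE) / 2,  # each experiment pulls us toward truth
    tolerance=1e-6,
)
print(round(result, 3))  # prints 42.0
```

The point of the caricature is the structure, not the arithmetic: no step is ever final, and every pass through the loop either confirms the current theory or adjusts it.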

Page 14: Applying the scientific method in Software Evaluation


Advantages

• Reproducible Results

• Impartiality

• Acceptance based on results of logical reasoning, observations and experiments


Page 15: Applying the scientific method in Software Evaluation

Scientific Methods of Computer Science

• Overarching principle: modeling

• Theoretical CS

• Experimental CS

• Computer simulation

Page 16: Applying the scientific method in Software Evaluation

Reasons to (not) Experiment

Reference: Tichy, Should Computer Scientists Experiment More?

Page 17: Applying the scientific method in Software Evaluation

Reasons for experimentation

• Falsification of theories

• Revealing of assumptions

• Elimination of alternative explanations

• Enabling induction: deriving theories from observations

Chapter 3 of 5

Page 18: Applying the scientific method in Software Evaluation

But “experiments cost too much”

• Experimentation requires more resources than theory, so why do it?

• Rebuttal:

• Meaningful experiments can fit into small budgets

• Expensive experiments can be worth more than their cost

Page 19: Applying the scientific method in Software Evaluation

But “too many variables”

• Impossible to control all variables, leading to useless results

• Rebuttal:

• Create benchmarks

• Control variables and errors, as is done e.g. in medicine

Page 20: Applying the scientific method in Software Evaluation

But “experimentation slows progress”

• Slower flow of ideas if experimentation is required

• Rebuttal: better identification of questionable ideas

Page 21: Applying the scientific method in Software Evaluation

But “technology changes too fast”

• Experiment duration exceeds the lifetime of the software tool / product

• Rebuttal: then the question is too narrow

Page 22: Applying the scientific method in Software Evaluation

Evaluating the Design

Reference: Leist, Zellner, Evaluation of Current Architecture Frameworks

Page 23: Applying the scientific method in Software Evaluation

Methodology

1. Establish a theoretical foundation

2. Create a list of features based on the foundation

3. Analyse whether the subjects exhibit the features

Chapter 4 of 5
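Steps 2 and 3 boil down to building a feature matrix. A minimal sketch of that bookkeeping (the framework names and recorded features below are placeholders, not the paper's data):

```python
# Step 2: feature list derived from the theoretical foundation (method engineering).
FEATURES = ["meta model", "procedure model", "modeling technique",
            "role document", "specification document"]

# Step 3: record which features each subject exhibits (placeholder data).
SUBJECTS = {
    "Framework A": {"meta model", "modeling technique", "specification document"},
    "Framework B": {"procedure model", "role document"},
}

def evaluate(subjects, features):
    """Return a feature matrix: subject -> {feature: 'yes'/'no'}."""
    return {name: {f: ("yes" if f in exhibited else "no") for f in features}
            for name, exhibited in subjects.items()}

matrix = evaluate(SUBJECTS, FEATURES)
print(matrix["Framework A"]["meta model"])  # prints yes
print(matrix["Framework B"]["meta model"])  # prints no
```

The hard scientific work is in steps 1 and 3 (justifying the feature list and judging "exhibits"); the matrix itself merely makes the judgments comparable across subjects.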

Page 24: Applying the scientific method in Software Evaluation


Evaluating Enterprise Architecture Descriptions

• Theoretical foundation: Method Engineering

• List of features:

• Meta Model

• Procedure Model

• Modeling Technique

• Role document

• Specification Document


Page 25: Applying the scientific method in Software Evaluation

Table 1: Evaluation of current architecture frameworks

Feature                  ARIS   C4ISR/DoDAF   FEAF   MDA    TEAF   TOGAF   Zachman
Specification document   Full   Full          Full   Full   Full   None    Full
Meta model               Full   Part          None   Full   Part   None    Part
Role                     None   Part          Full   Part   Full   None    None
Technique                Full   Full          None   Part   None   Part    None
Procedure model          None   Full          Full   Full   Full   Full    Part

Legend: Full = fully accomplished, Part = partly accomplished, None = not accomplished

For most of the architecture frameworks the strengths and weaknesses are very balanced. The adequacy of an architecture framework to support a development project can be evaluated only in consideration of the specific circumstances of a case.

The results of the evaluation indicate at the same time potentials for further development of the frameworks. For example, in FEAF and TOGAF a meta model is not defined. The meta models determine all objects and all their relationships which are modeled to describe the enterprise architecture, in order to establish the consistency of the models. Since the only constant in today's business world is change, architecture models will often be adapted. Therefore the specification of meta models is of utmost importance to assure the consistency of architecture descriptions, and would be a great benefit for the architecture frameworks.

The development of a role model would be a valuable advancement for some architecture frameworks too. The role model constitutes responsibilities for certain tasks, which support not only the development but also the maintenance of the architecture descriptions. Including common techniques is helpful for the reuse of universal knowledge. The procedure model is often requested in practice because it itemizes what to do, and the specification documents therefore define targets. To sum up, such completion would in general be a great benefit for each architecture framework.

6. CONCLUSIONS

The objective of this paper was to evaluate well-known architecture frameworks regarding their contribution to support architecture development projects. Since well-founded investigations into the term and the constituent elements of a method exist, we could derive several requirements for the development of enterprise architectures and their descriptions.

A main result of the evaluation is that no framework meets all requirements regarding the constitutive elements of a method. Based on these findings, further potentials for developing architecture frameworks are pointed out: for example, using a procedure model makes it easier to develop architecture descriptions, because of the ability to follow a structured procedure. Techniques are useful for producing the required specification documents within the procedure model.

Another interesting result is that strengths and weaknesses are very balanced for each architecture framework. The evaluation shows for each framework which requirements regarding a methodic procedure it meets. Furthermore, the selection of an appropriate framework must consider the situation and the objectives for which the framework is used. For that reason, further research on improving architecture frameworks needs to focus on the situations and objectives for which the frameworks are appropriate.


Page 26: Applying the scientific method in Software Evaluation

Evaluating the Implementation

Reference: Bansiya, Evaluating Framework Architecture Structural Stability

Page 27: Applying the scientific method in Software Evaluation

Methodology

1. Establish a hypothesis

2. Identify characteristics related to the hypothesis

3. Define metrics to assess each characteristic

4. Collect data from the metrics

5. Determine whether the data supports the hypothesis

Chapter 5 of 5

Page 28: Applying the scientific method in Software Evaluation

Evaluating Structural Stability of Windows Application Frameworks

• Hypothesis: the extent of change stabilises as a framework matures

Characteristic                           Metric
Design size in classes                   count
Number of single/multiple inheritances   count
Average inheritance depth/width          sum of the maximum path lengths of all classes, divided by the number of classes
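The third metric can be sketched in Python, under the assumption that "maximum path length" means the longest chain of base classes from a class down to the root:

```python
def max_path_length(cls):
    """Longest inheritance path from cls down to object (object itself = 0)."""
    if cls is object:
        return 0
    return 1 + max(max_path_length(base) for base in cls.__bases__)

def average_inheritance_depth(classes):
    # sum of the maximum path lengths of all classes, divided by the number of classes
    return sum(max_path_length(c) for c in classes) / len(classes)

# Toy hierarchy:
class A: pass          # depth 1
class B(A): pass       # depth 2
class C(B): pass       # depth 3

print(average_inheritance_depth([A, B, C]))  # prints 2.0
```

For C++ frameworks like MFC and OWL the same measurement would be done with a static-analysis tool over the class hierarchy rather than via runtime reflection; the arithmetic is identical.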

Page 29: Applying the scientific method in Software Evaluation


2.4.1. Computing Extent-of-Change

To compute the extent-of-change, the metric values for the frameworks are normalized with respect to the metric's values in the previous version of the frameworks. The relative values of metrics in the first versions of frameworks are set to unity for reference. Table 4 shows the normalized metric values (computed by dividing the actual metric values of a version by the metric's value in the previous version) of the MFC and OWL frameworks. The normalized metric values of the framework versions are summed to compute an 'aggregate-change' measure [Table 4]. This aggregate-change value for the first version of the framework (V1) equals the number of metrics used in the assessment process. The extent-of-change value for subsequent versions is computed by taking the difference of the aggregate-change (Vi) of a version (i, with i > 0) with the aggregate-change (V1) value of the first version.

Table 4. Normalized architecture assessment metric values of MFC and OWL frameworks

Design Metric             MFC 1.0  MFC 2.0  MFC 3.0  MFC 4.0  MFC 5.0  OWL 4.0  OWL 4.5  OWL 5.0  OWL 5.2
Design Size               1        1.28     1.43     1.56     1.13     1        1.73     2.51     1.00
Hierarchies               1        1        1        5        1.20     1        1.33     1.81     0.93
Single Inheritances       1        1.27     1.39     1.49     1.11     1        1.65     3.61     0.98
Multiple Inheritance      1        1.00     1.00     1.00     1.00     1        1.44     2.31     1.00
Depth of Inheritance      1        1.26     1.16     0.95     0.99     1        1.04     1.75     0.99
Width of Inheritance      1        0.99     0.98     0.96     0.98     1        0.94     1.33     0.98
Number of Parents         1        1.26     1.16     0.95     0.99     1        0.97     1.61     0.99
Number of Methods         1        1.70     1.26     1.20     0.96     1        0.87     1.55     0.99
Class Coupling            1        1.26     1.19     1.03     0.96     1        1.21     2.09     1.00
Aggregate Change (Vi)     9.0      11.01    10.57    14.14    9.31     9.0      11.18    18.57    8.86
Extent-of-Change (Vi-V1)  0.0      2.01     1.57     5.14     0.31     0.0      2.18     9.57     -0.14

The extent-of-change measure can be used as an indicator (index) of framework architecture stability. The value of the measure indicates the relative stability of the architecture structure: higher values of the measure are reflective of higher instability in the structure, while values closer to zero indicate greater stability. MFC version 5.0 has a stability index of 0.31, which is close to zero.
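The computation described above is easy to reproduce. The sketch below applies it to two invented versions of a framework measured on three metrics; the metric names and values are illustrative, not the MFC/OWL data from Table 4.

```python
def extent_of_change(versions):
    """versions: list of dicts mapping metric name -> raw value, oldest first.

    Returns the extent-of-change (Vi - V1) for each version, per Bansiya:
    normalize each version against its predecessor, sum the normalized
    values into an aggregate-change, and subtract the first version's
    aggregate (which equals the number of metrics).
    """
    n_metrics = len(versions[0])
    results = []
    prev = None
    for version in versions:
        if prev is None:
            normalized = {m: 1.0 for m in version}  # first version: unity reference
        else:
            normalized = {m: version[m] / prev[m] for m in version}
        aggregate = sum(normalized.values())        # 'aggregate-change' measure
        results.append(aggregate - n_metrics)       # extent-of-change: Vi - V1
        prev = version
    return results

# Two toy versions of a framework, measured on three metrics:
v1 = {"design_size": 100, "methods": 400, "coupling": 50}
v2 = {"design_size": 128, "methods": 680, "coupling": 63}
print(extent_of_change([v1, v2]))  # extent 0.0 for V1, ~1.24 for V2
```

A value near zero between consecutive versions means the structure barely changed, which is exactly the stabilisation the hypothesis on slide 28 predicts for a maturing framework.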

Page 30: Applying the scientific method in Software Evaluation

Conclusion

• Computer Science is a true science

• To have measurable results, choose hypotheses which can actually be measured

• Remember falsifiability tests!

• Measure your own progress by evaluating your approach

Page 31: Applying the scientific method in Software Evaluation


References:

Denning, Is Computer Science Science?: http://cs.gmu.edu/cne/pjd/PUBS/CACMcols/cacmApr05.pdf

Dodig-Crnkovic, Scientific Methods in Computer Science: http://www.mrtc.mdh.se/publications/0446.pdf

Tichy, Should Computer Scientists Experiment More?: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=675631

Leist, Zellner, Evaluation of Current Architecture Frameworks: http://portal.acm.org/citation.cfm?id=1141635

Bansiya, Evaluating Framework Architecture Structural Stability: http://portal.acm.org/citation.cfm?doid=351936.351954

Brown, Wallnau, A Framework for Evaluating Software Technology: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=536457