1
Approaches to Evaluation in Software Engineering
Carl-Fredrik Sørensen, PhD Trial Lecture, 16.02.2006, Trondheim
2
Outline
• What, Why, Who, How, Where to evaluate?
  – Objects to study?
  – Objective of evaluation?
  – Effects studied?
  – Context of evaluation?
• Evaluation in SE Industry.
• Evaluation in SE Research.
• Challenges
• Summary
3
What to evaluate? What are the objects of study?
• What, Why, Who, Where, How to evaluate?
• Evaluation in SE Industry.
• Evaluation in SE Research.
• Challenges
• Conclusion
4
Software Engineering (SE)
The IEEE Computer Society defines software engineering as:
1. “The application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software; that is, the application of engineering to software.”
2. “The study of approaches as in (1).”

Ian Sommerville, Software Engineering, Addison-Wesley, 2001:
– An engineering discipline which is concerned with all aspects of software production, from the early stages of system specification through to maintaining the system after it has gone into use.
5
Knowledge Areas in Software Engineering (SWEBOK)
• Software requirements
• Software design
• Software construction
• Software testing
• Software maintenance
• Software configuration management
• Software engineering management
• Software engineering process
• Software engineering tools and methods
• Software quality
6
Why evaluate in SE? What are the objectives?
• What, Why, Who, Where, How to evaluate?
• Evaluation in SE Industry.
• Evaluation in SE Research.
• Challenges
• Conclusion
7
General Evaluation Objectives
• General: Understand state-of-practice. Confirm theories or conventional wisdom.
• Exploration: when an area is not well understood.
• Description: describe the current state of things.
• Prediction: predicting the future.
• Explanation: explaining why things happen.
8
SE Evaluation Objectives
• Understand the software process and product.
• Define, measure and validate qualities of process and product.
• Evaluate and confirm successes and failures.
• Information feedback for project control.
• Learn from experience; learn to predict (planning).
• Evaluate technology.
• Improve software development.
9
What is the context of evaluation?
• What, Why, Who, Where, How to evaluate?
• Evaluation in SE Industry.
• Evaluation in SE Research.
• Challenges
• Conclusion
10
How to evaluate?
• What, Why, Who, Where, How to evaluate?
• Evaluation in SE Industry.
• Evaluation in SE Research.
• Challenges
• Conclusion
11
Approaches to Evaluation
• Methods for model definition.
• Definition of measurements or metrics (sketched below).
• Methods for data capture.
• Methods for analysis.
• Managing validity threats.
• Research methods designed for objective evaluation.
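A minimal sketch of the metric-definition and analysis steps above (my illustration, not from the lecture). Defect density is a standard size-normalised quality metric; the module names and counts are invented:

```python
# Hypothetical capture step: defects and size recorded per module.
modules = {
    "parser": {"defects": 12, "loc": 4800},
    "ui":     {"defects": 7,  "loc": 9100},
}

# Analysis step: defect density, a size-normalised product metric.
for name, m in modules.items():
    density = m["defects"] / (m["loc"] / 1000)  # defects per KLOC
    print(f"{name}: {density:.1f} defects/KLOC")
```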
12
Quantitative Research
• Why:
  – Provides data that can be used to generate statistical results.
• What data:
  – Numbers or discrete categories (interval, ratio).
• Used to evaluate:
  – Hypotheses (see the sketch below).
  – Cause-effect relationships.
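To make this concrete, a minimal sketch (not from the lecture) of evaluating a hypothesis on ratio data with a two-sample t-test; the defect counts are invented and scipy is assumed available:

```python
from scipy import stats

# Hypothetical defect counts per module, with and without inspections.
inspected = [3, 5, 2, 4, 3, 6, 2, 4]
not_inspected = [7, 6, 9, 5, 8, 7, 6, 9]

# H0: inspection has no effect on defect count.
t_stat, p_value = stats.ttest_ind(inspected, not_inspected)

# A small p-value strengthens (never proves) the alternative hypothesis.
print(f"t = {t_stat:.2f}, p = {p_value:.4f}, reject H0 at 5%: {p_value < 0.05}")
```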
13
Qualitative Research
• Why:
  – Combination of technical and human behaviour.
  – Complexity of human behaviour is difficult to quantify.
  – Provides more explanatory information.
• What data:
  – Words, pictures, observations, interviews, diaries, etc. (nominal, ordinal).
• Used to evaluate:
  – Non-tangible objects like people, organisations, processes, etc.
  – Relationships between technology and people.
14
Qualitative Vs. Quantitative
• Qualitative data is often assumed subjective.
• Quantitative data is assumed objective.
• Subjectivity and objectivity are orthogonal to the data!
• Combinations:
  – Qualitative data explains the reasons behind hypotheses and relationships.
  – Anomalies are described differently.
  – Increases the amount of information.
  – Increased diversity of the data increases confidence in the results.
  – The two are often alternated within empirical studies.
15
Validity and Reliability
• Construct validity: variables accurately model the hypotheses – right metrics.
• Internal validity: changes in the dependent variables can be attributed to changes in the independent variables – right data.
• Conclusion validity: concerned with the relationship between treatment and outcome – right statistics.
• External validity: generalisation of the results outside the study context – right respondents/sample.
• Threats to validity: factors that influence the interpretation and the ability to draw conclusions.
16
Industrial SE evaluation
• What, Why, Who, How, Where to evaluate?
• Evaluation in SE Industry.
• Evaluation in SE Research.
• Challenges
• Conclusion
17
Evaluate SE practice
• Goals of evaluation:
  – Increase quality.
  – Budget compliance.
  – Eventually, software process improvement – a learning organisation.
18
Product Evaluation
• Right product? – Software requirements:
  – Conducting requirements reviews, prototyping, model validation and acceptance tests.
• Make it right? – Software architecture/design:
  – Definition of quality attributes, quality analysis, and design reviews.
• Product – Software construction:
  – Validation: against requirements, quality attributes, user satisfaction.
  – Verification: software testing (as developer, as user).
19
Process Evaluation
• Software engineering management:
  – Determining satisfaction of requirements.
  – Reviewing and evaluating performance.
• Software engineering process:
  – Define/use process assessment models and methods.
  – Define/use process and product measurement (size, structure, quality).
• Tools and methods:
  – Software process improvement.
  – Evaluate the effect of introduction/use.
  – Feature analysis and tool benchmarking.
20
Industrial Process Measurement
• Two general types: analytic and benchmarking.
• Benchmarking: adoption of best practice.
• Analytical techniques: rely on “quantitative evidence”.
  – Quality Improvement Paradigm (QIP).
  – Experimental studies: controlled or quasi-experiments.
  – Personal Software Process (PSP): on the individual level.
21
Software verification & validation
• Disciplined approach to assessing software products throughout the product life cycle:
  – Addresses software quality.
  – Locates defects.
  – Conformance to requirements.
• Quality-enhancing methods:
  – Formal verification wrt. the specification.
  – Inspections and reviews.
  – Testing.
  – User acceptance.
22
Inspections, reviews & walkthroughs
• Static verification and validation.
• Aim: find defects earlier in the software process.
• Defects are costly: roughly a 10x increase in the cost of finding and correcting a defect in a later project phase.
• Practice: detection and correction of software problems are often deferred until late in software projects.
• Reading techniques are important!
23
Testing
• Dynamic verification and validation.
• Defect detection with respect to expected behaviour (see the test sketch below).
• Ensures functionality and reliability.
• Applicable to the whole development process.
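As an illustration of dynamic defect detection against expected behaviour (my sketch, not part of the original slides; the discount function is hypothetical):

```python
import unittest

def discount(price: float, percent: float) -> float:
    """Hypothetical unit under test: apply a percentage discount."""
    return price * (1 - percent / 100)

class DiscountTest(unittest.TestCase):
    def test_expected_behaviour(self):
        # Fails, i.e. detects a defect, if the observed output
        # deviates from the specified expected behaviour.
        self.assertAlmostEqual(discount(100.0, 20), 80.0)

    def test_no_discount_boundary(self):
        self.assertAlmostEqual(discount(100.0, 0), 100.0)

if __name__ == "__main__":
    unittest.main()
```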
24
Verification V Diagram
[Figure: V-model of development and verification. Left leg, top to bottom: write user requirements, write system requirements, develop product architecture, design components, implement design/build components. Right leg, bottom to top: component verification, integrate and test components, verify system requirements, verify user requirements, ending in a validated product. Requirements traceability links each development phase to the corresponding verification activity through testing and ITADS. ITADS = Inspection, Test, Analysis, Design, Similarity.]
25
Research Methods in Software Engineering
• What, Why, Who, How, Where to evaluate?
• Evaluation in SE Industry.
• Evaluation in SE Research.
• Challenges
• Conclusion
26
Research in SE
• Research approaches are typically:
  – Descriptive, evaluative, formulative, or predictive.
• General categories of research approaches in SE:
  – Scientific method – formal/mathematical.
  – Engineering method.
  – Empirical method.
  – Analytical method – a variant of the scientific method.
27
Scientific method
• Top-down approach.
• Emphasis on finding better formal methods and languages.
• Builds mathematical models of phenomena.
• Simulates the models and refines them in iterations.
• Mostly quantitative data.
• This method does not scale up in SE!
28
Engineering method
• Bottom-up approach.
• Emphasis on finding better methods for structuring large systems and software development.
• Software development is viewed as a creative task which cannot be controlled other than through rigid constraints on the resulting product.
29
Empirical method
• Compares beliefs to observations (facts).
• Helps to understand how and why “things” work.
• Understanding allows for changes/improvement.
• Generation of theory – exploratory.
• Strengthening or confirmation of research propositions/theories – confirmative.
• Building, testing, applying, and refining theory.
• Not able to prove hypotheses!
  – Can only strengthen or weaken them.
30
Experimentation in Software Engineering
• Observational methods.
• Historical methods.
• Controlled methods.
• Mixed methods.
31
Observational methods (validation method – description; strengths; weaknesses)
• Project monitoring – collect development data. Strengths: provides a baseline for the future; inexpensive. Weaknesses: no specific goals.
• Case study – monitor a project in depth. Strengths: can constrain one factor at low cost. Weaknesses: poor control for later replication; not useful for prediction; validity threats.
• Assertion – use ad hoc validation techniques. Strengths: serves as a basis for future experiments. Weaknesses: insufficient validation.
• Field study – monitor multiple projects. Strengths: inexpensive form of replication. Weaknesses: treatments differ across projects.
32
Historical methods (validation method – description; strengths; weaknesses)
• Literature search – examine previously published studies. Strengths: large available database; inexpensive. Weaknesses: selection bias; treatments differ.
• Legacy data – examine data from completed projects. Strengths: combines multiple studies; inexpensive. Weaknesses: cannot constrain factors; data limited.
• Lessons learned – examine qualitative data from completed projects. Strengths: determine trends; inexpensive. Weaknesses: no quantitative data; cannot constrain factors.
• Static analysis – examine the structure of the developed product (see the sketch below). Strengths: can be automated; applies to tools. Weaknesses: not related to the development method.
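To make the static analysis row concrete, a small sketch (my addition, not from the original table) that examines product structure without executing it, using Python's ast module; the source snippet is invented:

```python
import ast

SOURCE = """
def pay(amount, vip):
    if vip:
        return amount * 0.9
    return amount
"""

tree = ast.parse(SOURCE)
# Crude structure metrics, gathered without running the product.
functions = sum(isinstance(n, ast.FunctionDef) for n in ast.walk(tree))
branches = sum(isinstance(n, (ast.If, ast.For, ast.While)) for n in ast.walk(tree))
print(f"functions: {functions}, branch points: {branches}")
```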
33
Controlled methods (validation method – description; strengths; weaknesses)
• Replicated experiment – develop multiple versions of the product/process. Strengths: can control factors for all treatments. Weaknesses: very expensive; Hawthorne effect.
• Synthetic environment experiments – replicate one factor in a laboratory setting. Strengths: can control individual factors; moderate cost. Weaknesses: scaling up; interactions among multiple factors.
• Dynamic analysis – execute the developed product for performance. Strengths: can be automated; applies to tools. Weaknesses: not related to the development method.
• Simulation – execute the product with artificial data (see the sketch below). Strengths: can be automated; applies to tools; evaluation in a safe environment. Weaknesses: data may not represent reality; not related to the development method.
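For the simulation row, a minimal sketch (my illustration; the sort_module stand-in is hypothetical) of executing a product with artificial data against a simple oracle:

```python
import random

def sort_module(xs):
    """Hypothetical stand-in for the product under evaluation."""
    return sorted(xs)

random.seed(42)  # reproducible artificial data
for _ in range(1000):
    data = [random.randint(-1000, 1000) for _ in range(50)]
    out = sort_module(data)
    # Oracle: output must be ordered and a permutation of the input.
    assert all(a <= b for a, b in zip(out, out[1:]))
    assert sorted(data) == sorted(out)

print("1000 simulated runs passed")
```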
34
Mixed research methods (validation method – description; strengths; weaknesses)
• Survey – ask questions to a population/sample (see the sketch below). Strengths: statistical validity; relatively cheap. Weaknesses: questionnaire bias; honesty of responses; many alternative explanations possible.
• Grounded theory – building theory from collected data. Strengths: creates new knowledge. Weaknesses: no clear mission.
• Post-mortem analysis – examine data from completed projects; a mix of survey and case study. Strengths: real-life project experiences. Weaknesses: as for case studies.
• Action research – a mix between experiment and case study. Strengths: achieves a practical outcome; creates new knowledge. Weaknesses: same as case study; validity threats.
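A sketch of analysing survey data (my addition; the Likert responses are invented), illustrating the survey row's strength (cheap, statistically summarisable) while the comments flag its ordinal-data caveat:

```python
from collections import Counter

# Hypothetical 5-point Likert responses to "Inspections improve quality".
responses = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4, 1, 4, 5, 4, 3]

counts = Counter(responses)
n = len(responses)
for score in range(1, 6):
    print(f"{score}: {100 * counts.get(score, 0) / n:.0f}%")

# Ordinal data: report the median rather than the mean.
print(f"median = {sorted(responses)[n // 2]}")
```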
35
Challenges
• Perform externally valid experiments.
• Create better empirical studies:
  – Define quantifiable theories/models and metrics.
  – Draw more credible conclusions from them.
• Selection of the appropriate method:
  – One size does not fit all.
  – Dependent on the research questions asked.
36
Summary
• Many different areas to evaluate.
• Different approaches for industry and academia.
• Different approaches for each object to evaluate.
• The research questions decide the method.
• Mixing and matching methods is often useful.
• Important to evaluate actual practice in order to provide improvements.
37
References
• Avison, David; Lau, Francis; Myers, Michael; and Nielsen, Peter Axel. "Action Research". Communications of the ACM, 42(1):94-97, 1999.
• Ciolkowski, Marcus; Laitenberger, Oliver; Rombach, Dieter; Shull, Forrest; and Perry, Dewayne. "Software Inspections, Reviews & Walkthroughs". ICSE 2002, Orlando, Florida, USA, pp. 641-642.
• Conradi, Reidar and Wang, Alf Inge (editors). Empirical Methods and Studies in Software Engineering. Springer, Heidelberg, Germany, 2003. LNCS 2765.
• Farbey, Barbara and Finkelstein, Anthony. "Evaluation in Software Engineering: ROI, but more than ROI". Proc. of the 3rd International Workshop on Economics-Driven Software Engineering Research (EDSER-3), 2001.
• Perry, Dewayne; Porter, Adam; and Votta, Lawrence. "Empirical Studies of Software Engineering: A Roadmap". ICSE 2000.
• Seaman, Carolyn B. "Qualitative Methods in Empirical Studies of Software Engineering". IEEE Transactions on Software Engineering, 25(4):557-572, July/August 1999.
• Moody, Daniel. "Empirical Research Methods". Lecture in IT Topics, 2002.
38
References
• Sommerville, Ian. Software Engineering. Addison-Wesley, 2001, 6th edition.
• Tichy, Walter F. "Should Computer Scientists Experiment More?". Computer, 31(5):32-40, May 1998.
• Wohlin, Claes; Runeson, Per; Höst, Martin; Ohlsson, Magnus C.; Regnell, Björn; and Wesslén, Anders. Experimentation in Software Engineering – An Introduction. Kluwer Academic Publishers, 2000.
• Zelkowitz, Marvin V. and Wallace, Dolores R. "Experimental Models for Validating Technology". Computer, 31(5):23-31, May 1998.
• IEEE Computer Society. SWEBOK: Guide to the Software Engineering Body of Knowledge – 2004 Version. http://www.swebok.org/ironman/pdf/SWEBOK_Guide_2004.pdf
• Thomas, J., FR-HiTEMP Ltd. Build, Integrate and Test. Presentation slides. www.secam.ex.ac.uk/teaching/ug/studyres/SOE3215-6/6a%20Build%20Int%20Test.ppt