Lecture topics Software process Software project metrics Software project management


Page 1: Lecture topics

Lecture topics

Software process
Software project metrics
Software project management

Page 2: Lecture topics

Does software have a life?

Software lifecycle is the sequence of stages the software goes through during its “lifetime”

Software is born
• Requirements, design, coding, testing

Software lives
• Maintenance

Software dies
• Software retirement

Software process governs software lifecycle

Page 3: Lecture topics

What is software process

A framework for a set of key areas necessary for successful production of software

General
• Applicable to most software projects

Outlines major tasks
• Requirements, specifications,…

Defines activities for each task
• Quality assurance
• Measurement of progress
• Document preparation

Page 4: Lecture topics

Why do we need software process?

A look back at mechanical engineering
• In the 1890s, a mechanical engineer, Frederick W. Taylor, invented “scientific management”
• The idea was that the way in which things are done is the key to better results
  • Improvements like using harder steel for the cutting tools
• The labor component is important
  • Only good operators can take advantage of the better cutting tools

Extensive opposition movement
• Many engineers thought that Taylor’s method wasn’t really engineering, but rather some non-technical hybrid

Page 5: Lecture topics

Why do we need software process (cont.)

What about development of software?
• In many cases, it’s a pretty chaotic process, similar to mechanical engineering in the 1800s
• Opinion of many managers: software engineering is a bag of tricks to keep programmers in line

So, the process of software development should be studied, formalized, and controlled by engineering techniques

“Software processes are software too” -- Leon J. Osterweil

There is a split between technical and management software people on the process issue

“Process vs. product” controversy: what’s more important, organizing people or organizing products?

Page 6: Lecture topics

Clash of issues: technical vs. managerial (or nerds vs. suits)

Very different concerns

The problem of running a large multi-person project is different from doing the work itself

This course didn’t really touch the managerial side

Engineers need managers!
• And vice versa, of course

Very few people are good at both technical and managerial jobs

Page 7: Lecture topics

Capability maturity model (CMM)

How do we measure the quality of a software process?

Need to do it to compare between organizations or to know how to improve software practices in a given organization

The Software Engineering Institute introduced the CMM model

Assigns a software development organization a maturity level

1 to 5, low to high maturity

Ain’t no simple formula

Careful evaluation of the organization is needed
• Mostly about how its software projects are conducted (established practices)

Introducing predictability into software development is a primary goal of the higher CMM levels

A high quality software process is not a guarantee of a high quality software product

But the likelihood of improving software quality is high

Page 8: Lecture topics

CMM levels

Level 1: Initial
• Ad hoc software development

Level 2: Repeatable
• Cost, schedule, functionality tracking

Level 3: Defined
• The process is standardized

Level 4: Managed
• Measurements of progress and quality are used

Level 5: Optimizing
• The process is being constantly improved

Page 9: Lecture topics

Initial level

Might be better to call it “level 0”

An organization may use many of the ideas from CMM, but not in the order or manner described in the formal levels

Thus, it will be placed on this initial level

Page 10: Lecture topics

Repeatable level

Refers more to the ability to track cost, schedule, and functionality than to the routine exercise of this ability

The only technical reference in the formal definition of this level is configuration management

The requirements might seem modest, but this level is quite hard to achieve

Page 11: Lecture topics

Defined level

The management practices of level 2 are formally defined and recorded

Followed throughout the organization even when things go wrong

There must be a Software Engineering Process group within the organization that codifies practices

Page 12: Lecture topics

Managed level

The central concept is measurement of the development process and the software product

The product here includes requirements, design, code, documentation, test plans etc.

Page 13: Lecture topics

Optimizing level

Introduces feedback into the process from the measurements of level 4

E.g., if a project is behind schedule in its design phase,

A manager at level 4 will have measurements to show this and then will try to correct matters (e.g. by adjusting schedule)

A manager at level 5 will use data from the delinquent project to try to discover the root cause of the problem and change the development process itself
• So that the problem does not occur in future projects

Page 14: Lecture topics

Critique of CMM levels

Most descriptions of CMM levels are full of hype

Descriptions of different levels are not specific

The basis of CMM is mostly managerial (not technical)

The step from level 1 to level 2 is based on management alone

In general, effort should not be spent on process at the expense of effort on product

Unless there’s a clear indication that the product will benefit from that

Page 15: Lecture topics

Using CMM to evaluate a potential employer

Knowing the CMM level of a potential employer is valuable data for an engineer

E.g. level 4 means that there is considerably more regimentation than at (say) level 2

Many employees at a level 4 organization will have rigid job descriptions
• Likely little scope for advancement
• Exciting technical risks are not taken
• But managerial personnel have more opportunities for advancement

Page 16: Lecture topics

Process management is not for every organization

First off, there are two extremes
• For a project involving a handful of people, process is often a waste of time
• A project involving hundreds of people will not succeed without process

What about the non-extreme cases?
• E.g., suppose that development time for a project is about 2 years, involving about 200,000 LOC
• Technical model: hire about eight senior engineers who work essentially without any management hierarchy
  • Productivity about 1200 LOC/person-month
• Managed model: hire 2 line managers and 16 junior engineers
  • Productivity about 500 LOC/person-month
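
The arithmetic behind the two staffing models can be checked with a short sketch. It assumes that only engineers write code (the two line managers contribute no LOC) and that productivity stays constant over the project; both are simplifications made for illustration, and the function name is hypothetical.

```python
def months_to_finish(total_loc, engineers, loc_per_person_month):
    """Calendar months needed if every engineer works in parallel at a constant rate."""
    return total_loc / (engineers * loc_per_person_month)


TOTAL_LOC = 200_000

# Technical model: 8 senior engineers at about 1200 LOC/person-month
technical_months = months_to_finish(TOTAL_LOC, 8, 1200)

# Managed model: 16 junior engineers at about 500 LOC/person-month
# (assumption: the 2 line managers are not counted as code producers)
managed_months = months_to_finish(TOTAL_LOC, 16, 500)

print(f"Technical model: {technical_months:.1f} months")
print(f"Managed model:   {managed_months:.1f} months")
```

Under these assumptions both models land near the 2-year target (roughly 21 and 25 months), which is why the choice between them is not obvious from the schedule alone.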

Page 17: Lecture topics

Are all software processes born equal?

There are many different ways to organize software production

Different process models

The choice of a process model is based on
• The nature of the project
• The methods and tools that the organization wants to use
• The controls over software production
• The product

Page 18: Lecture topics

Waterfall process model

Does not represent the practice well

Too rigid

Parallel production is limited

All requirements must be specified fully in the beginning

[Figure: waterfall diagram with the stages requirements, HL design, LL design, coding, testing]

Page 19: Lecture topics

Prototyping process model (evolutionary development model in Sommerville)

This model is often practical!

Customers may get the wrong impression about the final product from the prototype

Customers may ask for deployment before the product is ready

Often prototype flaws are not fixed in the final product

[Figure: cycle of requirements, prototype development, and prototype test-drive]

Page 20: Lecture topics

Rapid application development (RAD) process model

A number of software teams, each
• Developing a well-defined part of the product
• Using the waterfall model

Benefits:
• Very rapid development
• Component-based (reusable) products

Drawbacks:
• Requirements have to be well-understood
• Product decomposition is not always possible
• Sensitive to lack of commitment

Page 21: Lecture topics

Incremental process model

Useful when deadline cannot be achieved directly

May require significant human resources
• If there is a large number of teams

[Figure: staggered increments along a time axis, each passing through requirements, HL design, LL design, coding, and testing]

Page 22: Lecture topics

Spiral process model

Natural for large software systems

Customers are “stuck” with the development organization

[Figure: spiral from “start” through four sectors: requirements, design, code, test]

Page 23: Lecture topics

Concurrent development process model

May reduce development time by exploiting concurrency

[Figure: state diagram of an activity with the states None, Awaiting changes, Under development, Under revision, Under review, Baselined, Done]

Page 24: Lecture topics

Formal development process model

Similar to the waterfall model in its structure

Formal processes are used on each stage

Formal specifications on the requirements stage, including formal verification

Formal process of transforming requirements into design and implementation

Standard testing of the code

Page 25: Lecture topics

So, which process model is the best?

Depends on many parameters (nature of the product, availability of resources, organization, etc.)

The spiral model should probably be the choice in most cases

Driven by risk - in the first turn of the spiral, the developers decide if building of the system is feasible

Page 26: Lecture topics

Software metrics

What are they?
• Formulas for computing quantitative characteristics of software development, deployment, and maintenance

Why do we need them? Consider the following scenario (Hamlet, Maybee):

Someone in the organization makes what is called a “business case” for a new product by estimating the revenue that will be lost day by day if it is not available. Then they guess how long the business can stand the loss and come up with a schedule for developing the product - a schedule that bears no relation to what is actually required to develop it. Engineers are then told: meet this schedule.

Software developers may think that the schedule is unrealistic, but how can they prove it?

E.g. through statistical measurements available for projects of comparable complexity

Page 27: Lecture topics

Primary way to measure software

The size of the project
• Lines of code (LOC)
• Function points

Historical data provides a link between LOC for a project and the resources needed:

People
• Number of personnel and length of the period they are needed

Time
• The whole process of development
• Individual phases of development

Capital goods
• Computers, desks, work rooms, pizzas, cups of coffee, …

Page 28: Lecture topics

But how would we know the size of a system before it is built?

Historical data
• We did something like that in the past…

Estimation models
• Not many people have personal experience with software projects of different sizes
• Models summarize experience in equations that relate project size, schedule, and effort
• Sophisticated: a 200,000 LOC project takes more than twice the resources of a 100,000 LOC project

Page 29: Lecture topics

Can we do better than using LOC?

Function points (FP) metric proposed

Based on counting:
• External input and output points
• User interaction points
• External interfaces
• Files used by the system

Each characteristic is evaluated based on its complexity (importance for the system) and assigned a weight

A word of caution: developed a long time ago
• Before OO programming
• Before database penetration
• Biased toward data processing systems

Page 30: Lecture topics

Function points metric

Unadjusted function-point count (UFC) formula:

UFC = Σ (number of elements of given type) × (weight), summed over the element types

E.g., let
• The number of inputs and outputs be 3, with assigned weight 10
• The number of user interactions be 2, with assigned weight 5
• The number of external interfaces be 5, with assigned weight 3
• The number of files used by the system be 2, with assigned weight 2

Then UFC for this system is 3*10 + 2*5 + 5*3 + 2*2 = 59
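
The worked example above can be reproduced with a few lines of Python. This is only a sketch: the function name and the list-of-pairs representation are choices made here for illustration, and the counts and weights are the ones from the slide.

```python
def unadjusted_function_count(items):
    """Sum count * weight over all element types.

    `items` is an iterable of (count, weight) pairs, one pair per element type.
    """
    return sum(count * weight for count, weight in items)


# (count, weight) pairs taken from the example on the slide
example_system = [
    (3, 10),  # inputs and outputs
    (2, 5),   # user interactions
    (5, 3),   # external interfaces
    (2, 2),   # files used by the system
]

ufc = unadjusted_function_count(example_system)
print(ufc)  # 59
```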

Page 31: Lecture topics

COCOMO estimation model

COnstructive COst MOdel

Developed by B. Boehm in the 1980s

Recognizes 3 classes of projects:
• Organic mode: small, simple projects; democratically configured teams
• Semi-detached mode: intermediate projects, a mix of rigid and non-rigid requirements
• Embedded mode: large projects, tight constraints

Defines 3 different levels
• Basic, intermediate, advanced

Page 32: Lecture topics

Levels of the COCOMO model

Basic
• Needs only the size in LOC

Intermediate
• Needs LOC and a set of cost drivers

Advanced
• Needs LOC and cost drivers
• Applies cost drivers to each activity of the software process
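
The Basic level can be sketched in a few lines. The slides do not list the equations, so the following uses the commonly published Basic COCOMO form, effort = a × KLOC^b and schedule = c × effort^d, with the coefficients from Boehm's 1981 model; treat both the formula and the coefficient table as an assumption brought in from outside the lecture, and the function name as hypothetical.

```python
# Basic COCOMO coefficients (a, b, c, d) per project class.
# Assumption: values from Boehm's published 1981 model, not from the slides.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(loc, mode):
    """Return (effort in person-months, schedule in months) for a project of `loc` lines."""
    a, b, c, d = COEFFICIENTS[mode]
    kloc = loc / 1000
    effort = a * kloc ** b       # grows faster than linearly because b > 1
    schedule = c * effort ** d
    return effort, schedule


effort, schedule = basic_cocomo(100_000, "semi-detached")
print(f"Effort:   {effort:.1f} person-months")
print(f"Schedule: {schedule:.1f} months")
```

Because the exponent b is greater than 1, doubling the size more than doubles the estimated effort, which is the sense in which such models are more sophisticated than a straight LOC proportion.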

Page 33: Lecture topics

Example: output from COCOMO for a 100,000 LOC project (Hamlet, Maybee)

Distributions:

Phase                        Effort (man-months)   Schedule (months)   Personnel on board
Requirements/specification   88.6 (17%)            6.0 (27%)           14.7
Design/code                  286.7 (55%)           9.8 (44%)           29.2
Integration and test         146.0 (28%)           6.5 (29%)           22.5

Model mode: semidetached
Model size: large (100,000 lines of code)

Total effort: 521.3 man-months, 152 man-hours/man-month
Total schedule: 22.3 months
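
The per-phase figures in this output follow from the totals and the percentage shares. The sketch below recomputes them, assuming that "personnel on board" is simply phase effort divided by phase duration; that reading is an inference from the numbers, not something the slide states, and small differences from the table are due to rounding of the totals.

```python
TOTAL_EFFORT = 521.3    # man-months, from the COCOMO output
TOTAL_SCHEDULE = 22.3   # months, from the COCOMO output

# (share of effort, share of schedule) per phase, from the COCOMO output
PHASES = {
    "Requirements/specification": (0.17, 0.27),
    "Design/code": (0.55, 0.44),
    "Integration and test": (0.28, 0.29),
}


def phase_breakdown(total_effort, total_schedule, phases):
    """Return {phase: (effort, schedule, average staff)} for each phase."""
    rows = {}
    for name, (effort_share, schedule_share) in phases.items():
        effort = total_effort * effort_share
        schedule = total_schedule * schedule_share
        # Assumption: average staffing = phase effort / phase duration
        rows[name] = (effort, schedule, effort / schedule)
    return rows


breakdown = phase_breakdown(TOTAL_EFFORT, TOTAL_SCHEDULE, PHASES)
for name, (effort, schedule, staff) in breakdown.items():
    print(f"{name:28s} {effort:7.1f} {schedule:5.1f} {staff:6.1f}")
```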

Page 34: Lecture topics

Rule-of-thumb facts from the COCOMO model

Projects in the range of 100,000 LOC take about 2 years

Required effort is
• 20% for requirements/specification
• 50% for design/coding
• 30% for the rest

Staffing and distribution depend on the type of the project, but generally are
• About 500 man-months
• Distributed roughly 30-40-30% among the phases

Page 35: Lecture topics

Software quality metrics

Correctness
• The degree to which software performs the intended function
• Metric: number of defects per KLOC

Maintainability
• The ease with which software can be corrected, adapted, or enhanced
• Metric: mean time to change

Integrity
• The degree to which software is protected against attacks
• Metric: the success ratio of (known) attacks

Usability
• The degree of user-friendliness
• Metric: the time period required to become efficient in the use of the system

Page 36: Lecture topics

Defect-related quality metrics

Defect removal efficiency (DRE)

DRE = E/(E+D)
• E is the number of errors found before delivery
• D is the number of defects found after delivery

Can be used to estimate defect removal efficiency of process steps:

DRE_i = E_i/(E_i + E_{i+1})
• E_i is the number of errors discovered on step i
• E_{i+1} is the number of errors discovered on step i+1
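
Both forms of the DRE formula can be computed with the same ratio. The sketch below uses made-up counts purely for illustration (they are not data from the lecture), and the function names are hypothetical.

```python
def dre(errors_found, defects_escaped):
    """Defect removal efficiency: the fraction of all problems caught before they escaped."""
    total = errors_found + defects_escaped
    if total == 0:
        raise ValueError("DRE is undefined when no errors or defects were recorded")
    return errors_found / total


def step_dre(errors_on_step, errors_on_next_step):
    """Per-step DRE_i = E_i / (E_i + E_{i+1})."""
    return dre(errors_on_step, errors_on_next_step)


# Hypothetical counts: 90 errors caught before release, 10 defects reported afterwards
overall = dre(90, 10)

# Hypothetical counts: 40 errors found in design review, 10 more found during coding
design_step = step_dre(40, 10)

print(f"Overall DRE:     {overall:.2f}")
print(f"Design-step DRE: {design_step:.2f}")
```

A value close to 1 means that few problems slipped past the activity being measured.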

Page 37: Lecture topics

Using quality metrics in management

Performance of individuals and teams can be compared

Team A found 112 errors in their software component; team B found 240 errors in their component
• Which team is better?

After the deployment of the system, 5 defects were traced to software produced by team A and 2 defects were traced to software produced by team B
• Which team is better?

The DRE metric for teams A and B is .9 and .8
• Which team is better?

Using quality metrics for management is not easy and can be misleading

Page 38: Lecture topics

So, if managed software process is so great, how come Open Source is so successful?

Background
• Enthusiasts write software that is often quite good
• Often done in collaboration by large groups of people
• Informally!

Open Source Foundation and Free Software Foundation are organizations that support the notion of open source software

High profile open source projects
• Linux
• Apache

Page 39: Lecture topics

A case study of open source software development: the Apache server

A. Mockus, R. Fielding, and J. Herbsleb
• Appeared in ICSE’2000

An attempt to investigate the claim that open source software development can successfully compete with traditional commercial development methods

Page 40: Lecture topics

Characteristics of open source software (OSS) development

Built by potentially large numbers (hundreds and even thousands) of volunteers

Extremely geographically distributed
• Participants rarely or never meet face to face

Work is not assigned
• People undertake the work they choose to undertake

There is no explicit system-level design, or even detailed design

There is no project plan, schedule, or list of deliverables

Page 41: Lecture topics

The Apache Web server

Began in February 1995
• An effort to coordinate existing fixes to the httpd program
• New architecture design by R. Thau in July 1995
• Apache httpd 1.0 released in January 1996

According to the Netcraft survey, the most widely deployed server
• Over 50% of the 7 million sites queried

Developer email list is used for communication among developers

Problem reporting database is used for communication between users and developers

CVS archive is used for version control

Page 42: Lecture topics

The Apache development process

The Apache Group (AG) is an informal organization of developers
• Only volunteers, with day jobs
• Each member can vote on the inclusion of any code change and has write access to CVS

Members
• People who have contributed for an extended period of time (usually >6 months)
• 25 as of April 2000

Core developers (about 15 at any given time)
• Only a subset of AG active (4-6 usually)

Page 43: Lecture topics

The Apache development process (cont.)

Each developer iterates through a common sequence of actions

• Discovering that a problem exists
• Determining whether a volunteer will work on it
• Identifying a solution
• Developing and testing the code within their local copy of the source
• Presenting the code changes to the AG for review
• Committing the code and documentation to the repository

Page 44: Lecture topics

The size of the Apache development community

Almost 400 different people contributed code
• 182 people contributed to 695 problem report (PR) related changes
• 249 people contributed to 6092 non-PR changes

3060 different people submitted 3975 problem reports
• 458 individuals submitted 591 reports that caused a change to the code or documentation

Page 45: Lecture topics

Distribution of the work within the development community

The top 15 developers contributed more than 88% of added lines and 91% of deleted lines of code
• A single person did about 20% of these

66% of the PR related changes were produced by the top 15 contributors

Page 46: Lecture topics

Code ownership

Hypothesis: a single person would write the vast majority of the code for a module

This didn’t happen!
• Of 42 .c files with >30 changes, 40 had at least two (and 20 had at least 4) developers making more than 10% of the changes

Page 47: Lecture topics

What is the defect density of Apache code?

It was higher than in the four other large (undisclosed) systems it was compared to
• The role of bloaty code is unclear, though

Apache did better than others in the number of defects in pre-test state
• There is no provision for systematic system test in OSS
• Code inspection is better under OSS?

Page 48: Lecture topics

Hypotheses based on this study

OSS projects will have a core of developers who control the code base

A group larger by an order of magnitude than the core will repair defects and an even larger group will report problems

Projects with a small number of developers besides the core will fail because of a large number of defects

In successful OSS projects, developers are also users

OSS developments exhibit very rapid responses to customer problems

Defect density in OSS project releases will generally be lower than in commercial code that has only been feature tested