
1

Towards a Theory of Programming

2

Roadmap

• Early Concepts & Thoughts
• Metrics
• UML Process
• Time is Money

3

Abstraction – Underground Map

4

5

The Babylonian Tower Principle

Programming languages still fail, and probably always will, to produce abstraction mechanisms suitable for large software modules.

6

Babylonian Tower (cont.)

• Each language has a hierarchy of abstractions

• E.g., Java has 5 levels:
– Methods
– Classes
– Files
– Packages
– Jar files

• Number of children (at any level) should be 7 ± 2

• => Total number of methods in a manageable Java program should be < 10^5
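
A quick check of this bound (the arithmetic below is added here, it is not on the slide): taking the upper end of 7 ± 2, a single program root has at most 9 children at each of the 5 levels, so it contains at most

    9^5 = 59,049 < 10^5

methods.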

7

Brooks: MMM

• X9 Factor
• Surgical Team
• Adding people to a late project makes it later
• People and Months are not interchangeable
• Flow diagrams are obsolete
• No silver bullet

8

Recap: High-Quality Design (interim definition)

• A design that
– Minimizes the number of bugs
– Minimizes the effort for adding new features

9

OOP Metrics

• Chidamber & Kemerer 1994

• Some require full source code

• Others require only the relationships between classes

• Motivation: Objective measurement of quality from the program itself

10

MCC: McCabe Cyclomatic complexity

• # branches in the method
• Typical value: ~5
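
A rough illustration (hypothetical method, not from the slides) of which decision points drive the count; with the conventional formula of decision points + 1, it lands right at the typical value of ~5:

    class McCabeExample {
        // Decision points: null-check if, for loop, inner if, and the && operator = 4.
        // Conventional MCC = decision points + 1 = 5.
        static int sumSmallPositives(int[] values) {
            int total = 0;
            if (values == null) {            // decision 1
                return 0;
            }
            for (int v : values) {           // decision 2
                if (v > 0 && v < 1000) {     // decisions 3 and 4 (if + &&)
                    total += v;
                }
            }
            return total;
        }
    }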

11

NOC: Number Of Children

• # direct subclasses
• Typical value: unbounded

– E.g.: java.util.Iterator

• High value means
– Reuse (good)
– Coupling (bad)

• Low value means the opposite
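
A minimal sketch (hypothetical classes) of how NOC is counted:

    // NOC(Shape) = 3: Circle, Square and Triangle extend it directly.
    // Indirect descendants would not be counted.
    abstract class Shape { abstract double area(); }
    class Circle extends Shape { double r;      double area() { return Math.PI * r * r; } }
    class Square extends Shape { double s;      double area() { return s * s; } }
    class Triangle extends Shape { double b, h; double area() { return b * h / 2; } }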

12

DIT: Depth of Inheritance Tree

• # ancestors
• Typical value: 1-2

• High value means
– Hard to understand
– High cohesion
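
A minimal sketch (hypothetical hierarchy) of how DIT is counted:

    // DIT counts a class's ancestors. Counting java.lang.Object as an ancestor:
    // DIT(Account) = 1 and DIT(SavingsAccount) = 2 -- the typical 1-2 range above.
    class Account { long balanceCents; }
    class SavingsAccount extends Account { double interestRate; }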

13

CBO: Coupling Between Objects

• #Classes on which a class directly depends

• Typical value: 30

• Low value means
– Low coupling (good)
– The code is using mostly primitive types (bad)
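
A minimal sketch (hypothetical classes; JDK types ignored for brevity) of how CBO is counted:

    // ReportPrinter directly depends on Invoice, Customer and PrinterDriver -> CBO = 3.
    class Invoice { long totalCents; }
    class Customer { String name; }
    class PrinterDriver { void print(String text) { System.out.println(text); } }
    class ReportPrinter {
        void printReport(Invoice invoice, Customer customer, PrinterDriver driver) {
            driver.print(customer.name + " owes " + invoice.totalCents + " cents");
        }
    }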

14

LCOM: Lack of Cohesion of Methods

• For each class, build an undirected graph
– A node for each method and each field
– An edge between a method and a field if the method accesses the field
– An edge between two methods if one of them calls the other
– LCOM = # connected components in this graph

• Good value: 1
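
A minimal sketch (hypothetical class) whose graph splits into two components, giving LCOM = 2 rather than the good value of 1:

    class UserSession {
        String userName;             // used only by greeting()
        long lastActivityMillis;     // used only by touch() and expired()

        String greeting()         { return "Hello " + userName; }
        void touch()              { lastActivityMillis = System.currentTimeMillis(); }
        boolean expired(long now) { return now - lastActivityMillis > 60_000; }
    }
    // Components: {greeting, userName} and {touch, expired, lastActivityMillis} -> LCOM = 2.
    // Splitting UserSession into two classes would bring each back to LCOM = 1.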

15

Shortcoming of Metrics

• Easy to game the system

• No correlation to quality
– Because quality cannot be measured

• Not normalized

16

Package Cycles: FindBugs 0.72

17

Package Cycles: FindBugs 1.35

18

Package Cycles: Ant

19

Package Cycles: Antlr

20

Package Cycles: Summary I

• Destructive rather than constructive
– Based on negative points

• Blind spots

• => The more blind spots you have, the better your score is!?

21

Package Cycles: Summary II

• Work only on statically typed languages

• Negative points

• => Dynamic languages will score very high

22

Hard Data: Defect Fixing Costs

Relative cost of fixing a defect (rows: time introduced; columns: time detected):

  Time Introduced    Req.   Design   Impl.   System Test   Post-Release
  Req.               1      3        5-10    10            10-100
  Design             -      1        10      15            25-100
  Impl.              -      -        1       10            10-25

• Source: Code Complete II (McConnell)

23

Think Ahead

24

Approach: UML Process

• Philosophy:
– Software has a top-down structure
– An optimal solution at stage n requires careful examination of all factors at stage n-1
– Human-readable documents are less prone to errors than source code
– A picture is worth a thousand words

• Values
– Measure twice, cut once
– Strive to prevent future defects

• Principles
– Top-down
– Divide & Conquer via careful design of interfaces
– Abstraction: Each stage concentrates on a specific kind of information

• Practices
– Analysis: Requirement gathering / use cases
– Architecture
– Design
– Implementation
– Testing

25

Discussion: UML Process

• Distinguishes between design and programming

• Promoted formats for describing programs
– Documents: SRS, TDD
– Visual models: UML, BON

• These representations abstract away statement-level details
– Such details are considered to have a minor effect on overall quality

• "Waterfall": Measure twice, cut once

26

UML Tools

• Diagrams
– Class
– Part
– State-chart
– Activity
– Sequence
– Deployment
– Use case

• Code generation from the UML model
– Round-tripping

• Highlight:
– Multiple abstractions
– Use case diagrams are a formal description of the informal notion of requirements

27

Iterative Waterfall

• Motivation: Requirements change

• Develop some of the program in the UML process
– All stages: analysis, design, impl., …

• Repeat for some other part of the program

• Challenge: which part to choose in each iteration?

28

Design Documents: The Manufacturing Analogy

• Customer needs a new something
– Medicine, airplane, yogurt, mobile phone, …

• Experts prepare a rough sketch

• Engineers prepare a detailed blue-print

• Workers manufacture the product by executing the blue-print

• Analogy
– Engineers are the software designers
– The blueprint is the UML model / design documents
– Workers are the programmers

29

Software is not Manufactured

• A medicine will be reproduced billions of times
• Each instance must be identical to the others
• => There's a need for a precise blueprint

• Programmers need to "manufacture" a program only once
– Reproduction is automatic (copying the executable)

• A fully detailed blueprint is not really needed
– If the programmer understands the designer's intent, a simple phone call is enough
– Formalization may be a waste of time

• The code is the blueprint

• The executable is the product

30

UML: The Building Architecture Analogy

• Customer wants to build a house

• Meets an architect and explains his needs

• Architect prepares a model

• Customer approves

• Engineer prepares a construction plan
– Addresses lower-level issues, e.g., drainage, structures, materials

• Contractor executes the construction plan

31

Software is not a Building

• In building, the costs of "undoing" are prohibitive
– Hence "measure twice, cut once"

• In software, "measure twice" may be more expensive than "cut twice"

• The building model provides the customer with a faithful description of the building

• In software, use case diagrams and req. documents do not come close to a faithful description of the final system
– Customer cannot provide effective feedback
– Chances of developing the wrong program are high

32

Criticism of the UML Process

• How do you know when to stop?
– Even UML supporters agree it is not adequate for coding methods and low-level classes
– => There is a level where a plain old compiler is better
– => Optimal results require a mixture
– => How do you know where the break-even point is?

• How do you know which classes you need?
– You start implementing in your head
– Is it really more cost-effective than implementing the real code?

• Much easier to express classes than state
– Tendency to yield designs w/ many similar classes even if the differences can be easily expressed via state

• Over-engineering
– Build a lot of flexibility into the software
– To prevent going back to the early stages (see next slide)

• Traceability

33

Over-Engineering

• Simple:
– A class that traverses (pre-order) a tree of files/folders
– Computes the total size of all files

• Over-engineering:
– Compute something else
– Iterate in a different order
– Ignore certain files
– Iterate over something other than files
– Iterate over something that is not hierarchical

• YAGNI: You Are not Going to Need It

34

The Mathematics of YAGNI

• A tree of height 3, degree 3
– Every third child is redundant (incl. its subtree)
– Total nodes: 13
– Redundant nodes: 1 + 1 + 4 = 6 (46%)

• A tree of height h, degree d
– Total nodes: s(h) = d*s(h-1) + 1, s(0) = 0
– Redundant nodes: r(h) = s(h-1) + (d-1)*r(h-1), r(0) = 0
– => r(h) is O(s(h))
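
A small sketch (added here, assuming r(0) = 0 as the base case) that evaluates the two recurrences for the h = 3, d = 3 example:

    class YagniMath {
        // Total nodes in a full tree of height h and degree d.
        static long s(int h, int d) { return h == 0 ? 0 : d * s(h - 1, d) + 1; }

        // Redundant nodes, when one of each node's d children (and its subtree) is speculative.
        static long r(int h, int d) { return h == 0 ? 0 : s(h - 1, d) + (d - 1) * r(h - 1, d); }

        public static void main(String[] args) {
            System.out.println(s(3, 3)); // 13 total nodes
            System.out.println(r(3, 3)); // 6 redundant nodes, i.e. ~46%
        }
    }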

35

The Agile Manifesto

…we have come to value:

Individuals and interactions over processes and tools

Working software over comprehensive documentation

Customer collaboration over contract negotiation
Responding to change over following a plan

36

The Importance of Time

• A hypothetical programming task

• Approach one: 5 days
• Approach two: 1 day
– Same interface
– Code is not well structured (high coupling, low cohesion)

• First approach
– Minimal effort: 5 days

• Second approach
– Effort: 1 day (best case) to 6 days (worst case)

• Prefer the second approach
– Tests will stay
– Other team members can work on their parts
– Sad scenario: you lost 1 day
– Happy scenario: you earned at least 4 days

• So time is a key factor. Can we estimate development time?

37

Time Estimates: Physician Appointments

• My physician keeps an accurate schedule well in advance

• Method: Compute time per appointment
– Evidence-based estimation
– Based on gathered statistical data and the law of large numbers

• Properties of appointments
– Countable
– Identifiable end
– Abundance

38

Time Estimates: A Software Project

• Time per class?
– Not countable

• Time per sub-system?
– Not abundant

• Features?
– Countable (breakdown of the big task)
– Identifiable end (write tests)
– Abundant (by definition)

39

Burn Charts

• Time is important
– (As shown on the previous slide)
– So, let's describe our progress vs. time

• Vertical axis: tasks completed
• Horizontal axis: timeline

• Two variants: burn-up, burn-down

40

Burn Down

41

Burn Up

42

Burn Up Example

43

Quality in Software (new definition)

• High-quality software is software whose burn curve is linear

• Similar to Big-O notation for algorithms

• Does not distinguish between two linear curves
– Differences in domain, languages, …

• States that flattening is the #1 risk
– Can be experienced even in student assignments

• Result oriented

44

Summary

• Time to completion is a key factor
• Time estimation by features is practical
• Burn-up charts show progress
• Quality: linear burn curve
