w9 Complexity

*Complexity metrics
measure certain aspects of the software (lines of code, # of if-statements, depth of nesting, ...)
use these numbers as a criterion to assess a design, or to guide the design
interpretation: higher value => higher complexity => more effort required (= worse design)
two kinds:
  intra-modular: inside one module
  inter-modular: between modules

*Size-based complexity measures

counting lines of code (intra-modular)
  differences in scale for different programming languages

Halstead's metrics: counting operators and operands

*Halstead's metrics:

n1: number of unique operators
n2: number of unique operands
N1: total number of operators
N2: total number of operands

*Example program
public static void sort(int x[]) {
    for (int i = 0; i < x.length - 1; i++) {
        for (int j = i + 1; j < x.length; j++) {
            if (x[i] > x[j]) {
                int save = x[i];
                x[i] = x[j];
                x[j] = save;
            }
        }
    }
}

[Callouts on the code: one operator with 1 occurrence, one operator with 2 occurrences]

*operator # of occurrences
public
sort()
int
[]
{}
for {;;}
if ()
=

[Callouts on the code: two operands with 2 occurrences each]

*operand # of occurrences
x         9
length    2
i         7
j         6
save      2
0         1
1         2
n2 = 7    N2 = 29

*Metrics:
size of vocabulary: n = n1 + n2
program length: N = N1 + N2
volume: V = N log2(n)
level of abstraction: L = V*/V, where V* = volume of the function prototype. For sort(x), V* = 2 log 2. For main(), L is high.
  approximation: L̂ = (2/n1)(n2/N2)
programming effort: E = V/L
estimated programming time: T = E/18 (in seconds)
estimate of N: N̂ = n1 log2(n1) + n2 log2(n2)
for this example: N = 68, N̂ = 89, L = .015, L̂ = .028
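A minimal Java sketch of the formulas above, applied to the sort example. n2 = 7 and N2 = 29 come from the operand table, and N1 = 39 follows from N = 68. The value n1 = 17 is an assumption: it is not legible on the slide, and was chosen because it reproduces the quoted estimates 89 and .028. Class and method names are illustrative only.

```java
final class Halstead {
    private Halstead() {}

    static double log2(double v) {
        return Math.log(v) / Math.log(2.0);
    }

    /** Size of vocabulary: n = n1 + n2. */
    static int vocabulary(int n1, int n2) {
        return n1 + n2;
    }

    /** Program length: N = N1 + N2. */
    static int length(int totalOperators, int totalOperands) {
        return totalOperators + totalOperands;
    }

    /** Volume: V = N log2(n). */
    static double volume(int length, int vocabulary) {
        return length * log2(vocabulary);
    }

    /** Estimate of N from the vocabulary alone: n1 log2(n1) + n2 log2(n2). */
    static double estimatedLength(int n1, int n2) {
        return n1 * log2(n1) + n2 * log2(n2);
    }

    /** Approximated level of abstraction: (2/n1)(n2/N2). */
    static double approxLevel(int n1, int n2, int totalOperands) {
        return (2.0 / n1) * ((double) n2 / totalOperands);
    }

    /** Programming effort: E = V/L. */
    static double effort(double volume, double level) {
        return volume / level;
    }

    /** Estimated programming time: T = E/18, read as seconds. */
    static double timeSeconds(double effort) {
        return effort / 18.0;
    }

    public static void main(String[] args) {
        int n1 = 17;     // ASSUMPTION: inferred, not legible on the slide
        int n2 = 7;      // from the operand table
        int bigN1 = 39;  // N - N2 = 68 - 29
        int bigN2 = 29;  // from the operand table

        int n = vocabulary(n1, n2);
        int len = length(bigN1, bigN2);
        double v = volume(len, n);
        double level = approxLevel(n1, n2, bigN2);
        double e = effort(v, level);

        System.out.println("n = " + n + ", N = " + len);
        System.out.println("estimated N = " + estimatedLength(n1, n2));
        System.out.println("V = " + v + ", approximated L = " + level);
        System.out.println("E = " + e + ", T = " + timeSeconds(e) + " s");
    }
}
```

Effort and time are computed here from the approximated level, since that is the only level that can be derived from the four counts alone.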

*Remember

Metrics are only estimates, give some idea on complexity
critique:
  different definitions of operand and operator
  explanations are not convincing (empirical, may not apply to other projects)
  source code must exist

*Other metrics (structure-based)
use
  control structures
  data structures
  or both
example complexity measure based on data structures: average number of instructions between successive references to a variable
best known measure is based on the control structure: McCabe's cyclomatic complexity

*Example program
public static void sort(int x[]) {
    for (int i = 0; i < x.length - 1; i++) {
        for (int j = i + 1; j < x.length; j++) {
            if (x[i] > x[j]) {
                int save = x[i];
                x[i] = x[j];
                x[j] = save;
            }
        }
    }
}

[Figure: control-flow graph of sort, nodes numbered 1 to 11]

*Cyclomatic complexity
e = number of edges (13)
n = number of nodes (11)
p = number of connected components (1)

CV = e - n + p + 1 = 13 - 11 + 1 + 1 = 4
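The calculation can be sketched in Java as follows. The first method is the slide's formula. The second is a commonly used shortcut for a single structured routine (number of decisions plus one), added here as a cross-check and not taken from the slide. Names are illustrative.

```java
final class Cyclomatic {
    private Cyclomatic() {}

    /** CV = e - n + p + 1, with e edges, n nodes, p connected components. */
    static int complexity(int edges, int nodes, int components) {
        return edges - nodes + components + 1;
    }

    /** Shortcut for one structured routine: decision points + 1. */
    static int fromDecisions(int decisionPoints) {
        return decisionPoints + 1;
    }

    public static void main(String[] args) {
        // Flow graph of sort: 13 edges, 11 nodes, 1 component.
        System.out.println(complexity(13, 11, 1)); // prints 4
        // sort has two for-loops and one if: three decisions.
        System.out.println(fromDecisions(3));      // prints 4
    }
}
```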

*Intra-modular complexity measures, summary
for small programs, the various measures correlate well with programming time
however, a simple length measure such as LOC does equally well
complexity measures are not very context sensitive
complexity measures take into account few aspects

it might help to look at the complexity density instead

*System structure: inter-module complexity
measures dependencies between modules:
  draw graph: modules = nodes
  edges connecting modules may denote several relations, most often: A uses B (ex: procedure calls)

*The uses relation
the call-graph
  chaos (general directed graph)
  hierarchy (acyclic graph)
  strict hierarchy (layers)
  tree

*In a picture:
[Figure: four call graphs, labelled chaos, strict hierarchy, hierarchy, tree]

*Measurements
size: # nodes, # edges
width
height

*Deviation from a tree
[Figure: three call graphs, labelled strict hierarchy, hierarchy, tree]

*Tree impurity metric
complete graph with n nodes has e = n(n-1)/2 edges

a tree with n nodes has e = (n-1) edges

tree impurity for a graph with n nodes and e edges: m(G) = 2(e-n+1) / ((n-1)(n-2))
(0 for tree, 1 for complete graph)
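A small Java sketch of the tree impurity formula, checking the two boundary cases named above. The class name and the guard for graphs with fewer than three nodes are assumptions added for illustration (the denominator is zero for n = 1 or n = 2).

```java
final class TreeImpurity {
    private TreeImpurity() {}

    /** m(G) = 2(e - n + 1) / ((n - 1)(n - 2)) for a connected undirected graph. */
    static double impurity(int nodes, int edges) {
        if (nodes < 3) {
            throw new IllegalArgumentException("formula needs at least 3 nodes");
        }
        double extraEdges = edges - nodes + 1;  // edges beyond a spanning tree
        double maxExtra = (nodes - 1) * (double) (nodes - 2) / 2.0;
        return extraEdges / maxExtra;
    }

    public static void main(String[] args) {
        System.out.println(impurity(5, 4));   // tree on 5 nodes: prints 0.0
        System.out.println(impurity(5, 10));  // complete graph K5: prints 1.0
    }
}
```

Dividing the number of extra edges by the largest possible number of extra edges is the same formula with the factor 2 moved into the denominator.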

*Any tree impurity metric:
m(G) = 0 if and only if G is a tree

m(G1) > m(G2) if G1 = G2 + an extra edge

if G1 and G2 have the same # of extra edges wrt their spanning tree, and G1 has more nodes than G2, then m(G1) < m(G2)

m(G) ≤ m(Kn) = 1, where G has n nodes, and Kn is the (undirected) complete graph with n nodes

*Information flow metric
tree impurity metrics only consider the number of edges, not their thickness
Shepperd's metric:
  there is a local flow from A to B if:
    A invokes B and passes it a parameter
    B invokes A and A returns a value
  there is a global flow from A to B if A updates some global structure and B reads that structure

*Shepperd's metric

fan-in(M) = # (local and global) flows whose sink is M
fan-out(M) = # (local and global) flows whose source is M
complexity(M) = (fan-in(M) * fan-out(M))^2
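The definitions above can be sketched in Java as follows. Each flow is represented as a {source, sink} pair of module names. This representation, the module names, and the example flows are hypothetical, and local and global flows are counted alike.

```java
import java.util.Arrays;
import java.util.List;

final class InformationFlow {
    private InformationFlow() {}

    /** Number of flows whose sink is the given module. */
    static int fanIn(List<String[]> flows, String module) {
        int count = 0;
        for (String[] flow : flows) {
            if (flow[1].equals(module)) {
                count++;
            }
        }
        return count;
    }

    /** Number of flows whose source is the given module. */
    static int fanOut(List<String[]> flows, String module) {
        int count = 0;
        for (String[] flow : flows) {
            if (flow[0].equals(module)) {
                count++;
            }
        }
        return count;
    }

    /** complexity(M) = (fan-in(M) * fan-out(M))^2. */
    static long complexity(List<String[]> flows, String module) {
        long product = (long) fanIn(flows, module) * fanOut(flows, module);
        return product * product;
    }

    public static void main(String[] args) {
        // Hypothetical flows: A->B, A->C, B->C, C->D.
        List<String[]> flows = Arrays.asList(
                new String[] {"A", "B"},
                new String[] {"A", "C"},
                new String[] {"B", "C"},
                new String[] {"C", "D"});
        System.out.println(complexity(flows, "C")); // fan-in 2, fan-out 1: prints 4
        System.out.println(complexity(flows, "A")); // fan-in 0: prints 0
    }
}
```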

*Point to ponder:
What does this program do?
procedure X(A: array [1..n] of int);
var i, k, small: int;
begin
  for i:= 1 to n do
  small:= A[i];
  for k:= i to n-1 do
  if small

*Object-oriented metrics

WMC: Weighted Methods per Class
DIT: Depth of Inheritance Tree
NOC: Number Of Children
CBO: Coupling Between Object Classes
RFC: Response For a Class
LCOM: Lack of COhesion of a Method

*Weighted Methods per Class

measure for size of class

WMC = Σ c(i), i = 1, ..., n (n = number of methods)

c(i) = complexity of method i

mostly, c(i) = 1
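As a sketch in Java: WMC is the sum of the per-method complexities, and with the common choice c(i) = 1 it reduces to the number of methods. The class name and the example weights are illustrative assumptions.

```java
final class Wmc {
    private Wmc() {}

    /** WMC = sum of c(i) over the n methods of a class. */
    static int wmc(int[] methodComplexities) {
        int sum = 0;
        for (int c : methodComplexities) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        // With c(i) = 1 for every method, WMC is the method count.
        System.out.println(wmc(new int[] {1, 1, 1}));  // prints 3
        // Hypothetical weights, e.g. a cyclomatic complexity per method.
        System.out.println(wmc(new int[] {4, 1, 2}));  // prints 7
    }
}
```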

*Depth of Class in Inheritance Tree
DIT = distance of class to root of its inheritance tree

Good: forest of classes of medium height
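For Java classes, DIT can be sketched with reflection by walking the superclass chain up to java.lang.Object. This sketch assumes that only class inheritance counts, so implemented interfaces are ignored. The class name Dit is illustrative.

```java
final class Dit {
    private Dit() {}

    /** Number of superclass steps from the given class to the root (Object). */
    static int depth(Class<?> type) {
        int depth = 0;
        // getSuperclass() returns null for Object, interfaces and primitives.
        for (Class<?> s = type.getSuperclass(); s != null; s = s.getSuperclass()) {
            depth++;
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(depth(Object.class));   // prints 0
        System.out.println(depth(Integer.class));  // Integer -> Number -> Object: prints 2
    }
}
```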

*Number Of Children

NOC: counts immediate descendants in inheritance tree

higher values of NOC are considered bad:
  possibly improper abstraction of the parent class
  also suggests that class is to be used in a variety of settings

*Coupling Between Object Classes
two classes are coupled if a method of one class uses a method or state variable of another class

CBO = count of all classes a given class is coupled with

high values: something is wrong

all couplings are counted alike; refinements are possible

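*Response For a Class
RFC = size of the response set of a class
response set = {M} ∪ {R1} ∪ {R2} ∪ ...
  {M} = set of methods of the class
  {Ri} = set of methods called by method Mi
[Figure: a class with methods M1, M2, M3, and R1 the set of methods called by M1]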
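A minimal Java sketch of the response set: take the class's own methods and add every method they call. Methods are identified by plain strings here. That representation and the example names are hypothetical. Only direct calls are followed, which is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class Rfc {
    private Rfc() {}

    /** RFC = size of (own methods united with the methods each of them calls). */
    static int rfc(Set<String> methods, Map<String, Set<String>> calls) {
        Set<String> response = new HashSet<>(methods);
        for (String m : methods) {
            response.addAll(calls.getOrDefault(m, Collections.<String>emptySet()));
        }
        return response.size();
    }

    public static void main(String[] args) {
        Set<String> methods = new HashSet<>(Arrays.asList("M1", "M2", "M3"));
        Map<String, Set<String>> calls = new HashMap<>();
        calls.put("M1", new HashSet<>(Arrays.asList("Other.f", "Other.g")));
        calls.put("M2", new HashSet<>(Arrays.asList("Other.g", "M3")));
        // Response set: M1, M2, M3, Other.f, Other.g
        System.out.println(rfc(methods, calls)); // prints 5
    }
}
```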

*Lack of Cohesion of a Method
cohesion = glue that keeps the module (class) together, e.g. all methods use the same set of state variables
if some methods use a subset of the state variables, and others use another subset, the class lacks cohesion
LCOM = number of disjoint sets of methods in a class
  two methods in the same set share at least one state variable
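One way to compute this LCOM variant is sketched below in Java. Methods that share a state variable are merged into one set using a small union-find, and the remaining sets are counted. The sketch assumes sharing is applied transitively (if a shares with b and b with c, all three end up in one set). The method and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class Lcom {
    private Lcom() {}

    /** Number of disjoint method sets; the map gives the fields each method uses. */
    static int lcom(Map<String, Set<String>> fieldsUsed) {
        List<String> methods = new ArrayList<>(fieldsUsed.keySet());
        int[] parent = new int[methods.size()];
        for (int i = 0; i < parent.length; i++) {
            parent[i] = i;
        }
        int sets = methods.size();
        for (int i = 0; i < methods.size(); i++) {
            for (int j = i + 1; j < methods.size(); j++) {
                Set<String> a = fieldsUsed.get(methods.get(i));
                Set<String> b = fieldsUsed.get(methods.get(j));
                if (!Collections.disjoint(a, b)) {  // share at least one field
                    int ri = find(parent, i);
                    int rj = find(parent, j);
                    if (ri != rj) {
                        parent[ri] = rj;
                        sets--;
                    }
                }
            }
        }
        return sets;
    }

    private static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]];  // path halving
            i = parent[i];
        }
        return i;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> fieldsUsed = new HashMap<>();
        fieldsUsed.put("deposit", new HashSet<>(Arrays.asList("balance")));
        fieldsUsed.put("withdraw", new HashSet<>(Arrays.asList("balance")));
        fieldsUsed.put("rename", new HashSet<>(Arrays.asList("name")));
        fieldsUsed.put("label", new HashSet<>(Arrays.asList("name", "id")));
        // Two sets: {deposit, withdraw} and {rename, label}
        System.out.println(lcom(fieldsUsed)); // prints 2
    }
}
```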

*OO metrics

WMC, CBO, RFC, LCOM most useful
  predict fault proneness during design
  strong relationship to maintenance effort

Many OO metrics correlate strongly with size

SE, Design, Hans van Vliet

*How does one come to a design?

Existing knowledge of the designer. Experts have ~50000 chunks of useful knowledge (see also pages 458-461). Chess example: if you show a chess board with a number of pieces on it to an experienced chess player and to a non-player, the player will do much better at reconstructing the board and pieces, but only if the configuration is a meaningful one. The same is true of programmers: a sequence of random lines of code is reproduced equally well by programmers and non-programmers. Surprisingly, programmers may reproduce a program with the same semantics but with, e.g., different variable names. So, from this example, they may just remember: SORT.

So this also plays a role in education: teach students more useful pieces of knowledge.
