Dealing with Software Complexity: The Discovery of Structure in Programs
Bartosz Milewski


Page 1: Dealing with Software Complexity

Dealing with Software Complexity

The Discovery of Structure in Programs

Bartosz Milewski

Page 2: Dealing with Software Complexity

Software Development

- Designing new code
- Understanding old code
  - Written by others
  - Written by current developer some time ago
- Maintenance starts from day two
- Code understanding: the most important and least automated part of development

Page 3: Dealing with Software Complexity

Software Understanding

- The trivial structure
  - Lexical
  - Grammatical
  - Alphabetical list of classes and functions
  - Diagrams (Booch, etc.)
- High-level structure
  - Relationships between classes (next slide)
  - Design ideas, patterns
  - Emerging structure

Page 4: Dealing with Software Complexity

Class Dependencies

- Class A uses class B
  - Strongly: it requires #include B.h
  - Weakly: it requires forward declaration of B
- Dependency graph analysis
  - Levels of abstraction
  - Tree-like structure (next slide)
  - Cycles and how to break them
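The dependency graph analysis mentioned above can be sketched as follows. This is a minimal illustration, not the author's tool: the `DependencyGraph` type and the functions `levelOf` and `computeLevels` are hypothetical names, and the graph is assumed to have been extracted already (for example from `#include` lines). A class that uses nothing sits at level 0, every other class sits one level above the highest class it uses, and a cycle is reported instead of being assigned a level.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: each class name maps to the classes it uses.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Returns the level of `name`: 0 for a class that uses nothing, otherwise
// 1 + the highest level among the classes it uses.
// Returns -1 if a cycle is reachable from `name`.
int levelOf(const DependencyGraph& graph, const std::string& name,
            std::map<std::string, int>& levels,
            std::set<std::string>& inProgress) {
    auto known = levels.find(name);
    if (known != levels.end()) return known->second;
    if (!inProgress.insert(name).second) return -1;  // already on the path: cycle

    int level = 0;
    auto entry = graph.find(name);
    if (entry != graph.end()) {
        for (const std::string& used : entry->second) {
            int usedLevel = levelOf(graph, used, levels, inProgress);
            if (usedLevel < 0) {
                inProgress.erase(name);
                return -1;
            }
            if (usedLevel + 1 > level) level = usedLevel + 1;
        }
    }
    inProgress.erase(name);
    levels[name] = level;
    return level;
}

// Computes levels for every class; returns false if the graph has a cycle.
bool computeLevels(const DependencyGraph& graph,
                   std::map<std::string, int>& levels) {
    levels.clear();
    std::set<std::string> inProgress;
    for (const auto& entry : graph) {
        if (levelOf(graph, entry.first, levels, inProgress) < 0) return false;
    }
    return true;
}
```

When `computeLevels` returns false, at least one cycle exists; one common way to break it is to turn a strong dependency into a weak one by replacing an `#include` with a forward declaration.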

Page 5: Dealing with Software Complexity

Hierarchy of Classes

[Diagram: class dependency hierarchy. Classes shown: Controller, ViewManager, IpcQueue, CmdVector, Commander, Cmd::Table, DisplayManager, SelectionManager]

Page 6: Dealing with Software Complexity

Neurology

- Brain can only work on tree-like structures
- There are 7 +/- 2 “registers”
- There is a cache that must be pre-loaded
- There is long-term, slow-search, memory
- Reality is infinite, brain is finite. Something must be discarded.
- Abstracting: creating an item that can fit in a register

Page 7: Dealing with Software Complexity

Programmer’s Brain

- Context switching: re-loading the brain cache (multitasking is expensive)
- Small program, easy to restart the brain
- Maintenance requires code understanding: it can take hours or days to fill the cache, and a lot is left to debugging and testing
- Few tools to speed up cache loading
- Needed: ready-made abstractions that fit the registers

Page 8: Dealing with Software Complexity

Abstractions

- Subtracting “irrelevant” features
  - What are features?
  - Which features are irrelevant?
- Abstracting and Categorizing
- Biology and evolution (next slide)

Page 9: Dealing with Software Complexity

The Origin of Abstraction

- Primitive organism has access to a featureless input stream, the “reality stream”
- Some things in the stream influence metabolism: selection for primitive detectors
- Data from detectors are “features”; evolution decides which features are relevant
- First abstractions: food and danger.
  - Particular combinations of features

Page 10: Dealing with Software Complexity

Imitating Life

- Cellular automata
  - Image processing, discovering features (lines, squares, faces)
  - Genetic algorithms, training
- Automata feeding on source code
  - Lexing automata
  - Parsing automata: require infinite number of states
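A lexing automaton needs only a fixed, finite set of states, which is what separates it from the parsing case. The sketch below is an assumption-laden illustration rather than the automaton from the talk: the names `TokenKind`, `Token`, `LexState` and `lex` are hypothetical, only identifiers, integers and single-character punctuation are recognized, and comments, string literals and multi-character operators such as `::` are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

enum class TokenKind { Identifier, Number, Punctuation };

struct Token {
    TokenKind kind;
    std::string text;
};

// The automaton's states: a fixed, finite set.
enum class LexState { Start, InIdentifier, InNumber };

std::vector<Token> lex(const std::string& source) {
    std::vector<Token> tokens;
    LexState state = LexState::Start;
    std::string current;

    // Emits the token being accumulated, if any, and returns to Start.
    auto flush = [&]() {
        if (state == LexState::InIdentifier)
            tokens.push_back({TokenKind::Identifier, current});
        else if (state == LexState::InNumber)
            tokens.push_back({TokenKind::Number, current});
        current.clear();
        state = LexState::Start;
    };

    for (char ch : source) {
        unsigned char c = static_cast<unsigned char>(ch);
        bool identStart = std::isalpha(c) || ch == '_';
        bool digit = std::isdigit(c) != 0;

        if (state == LexState::InIdentifier && (identStart || digit)) {
            current += ch;  // stay in InIdentifier
        } else if (state == LexState::InNumber && digit) {
            current += ch;  // stay in InNumber
        } else {
            flush();
            if (identStart) {
                state = LexState::InIdentifier;
                current += ch;
            } else if (digit) {
                state = LexState::InNumber;
                current += ch;
            } else if (!std::isspace(c)) {
                tokens.push_back({TokenKind::Punctuation, std::string(1, ch)});
            }
        }
    }
    flush();
    return tokens;
}
```

Matching arbitrarily deep parentheses or braces cannot be done with a fixed set of states like this; the automaton needs at least a counter, whose value can grow without bound.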

Page 11: Dealing with Software Complexity

Scope Discovery

- Cellular automata that can count
- 2-d state space
  - Vertical counts parentheses
  - Horizontal counts braces

Foo::Foo (Bar bar) : _bar (bar) { … { … g ( x ) …} … }
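A counting automaton of this kind can be sketched with two integer counters. This is only an illustration under simplifying assumptions: `ScopeState` and `traceScopes` are hypothetical names, and delimiters inside comments, string literals or character literals are counted like any others.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One point in the 2-d state space.
struct ScopeState {
    int parenDepth;  // "vertical" axis: currently open parentheses
    int braceDepth;  // "horizontal" axis: currently open braces
};

// Returns the automaton's state after each character of `source`.
std::vector<ScopeState> traceScopes(const std::string& source) {
    std::vector<ScopeState> trace;
    ScopeState state = {0, 0};
    for (char ch : source) {
        switch (ch) {
            case '(': ++state.parenDepth; break;
            case ')': --state.parenDepth; break;
            case '{': ++state.braceDepth; break;
            case '}': --state.braceDepth; break;
            default: break;  // every other character leaves the state unchanged
        }
        trace.push_back(state);
    }
    return trace;
}
```

For the constructor above, the state at `x` is one open parenthesis and two open braces, and a balanced input ends back at the origin of the state space.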

Page 12: Dealing with Software Complexity

Bubble Diagrams

- Glue together matching parentheses
- Glue together matching braces

Foo::Foo (Bar bar) : _bar (bar) { … { … g ( x ) …} … }

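[Diagram: the constructor above drawn as nested bubbles, one bubble per matched pair of parentheses or braces, enclosing Foo::Foo, Bar bar, : _bar, bar, g and x]

- Shows nesting complexity
- Bird’s-eye view (indentation level)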
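Gluing matching delimiters together amounts to pairing each opener with its closer, which a stack does directly. The following is a minimal sketch, not the diagramming tool itself: `Bubble`, `findBubbles` and `maxDepth` are hypothetical names, and comments and string literals are again ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A hypothetical "bubble": one matched pair of delimiters.
struct Bubble {
    char open;          // '(' or '{'
    std::size_t begin;  // index of the opening delimiter
    std::size_t end;    // index of the matching closing delimiter
    int depth;          // 0 for outermost bubbles
};

// Pairs up delimiters with a stack, listing bubbles in opening order.
// Returns false if a delimiter is unmatched or mismatched.
bool findBubbles(const std::string& source, std::vector<Bubble>& bubbles) {
    bubbles.clear();
    std::vector<std::size_t> openStack;  // indices into `bubbles`
    for (std::size_t i = 0; i < source.size(); ++i) {
        char ch = source[i];
        if (ch == '(' || ch == '{') {
            Bubble b = {ch, i, i, static_cast<int>(openStack.size())};
            openStack.push_back(bubbles.size());
            bubbles.push_back(b);
        } else if (ch == ')' || ch == '}') {
            if (openStack.empty()) return false;
            Bubble& top = bubbles[openStack.back()];
            char expected = (top.open == '(') ? ')' : '}';
            if (ch != expected) return false;
            top.end = i;
            openStack.pop_back();
        }
    }
    return openStack.empty();
}

// Deepest nesting level found; -1 means there are no bubbles at all.
int maxDepth(const std::vector<Bubble>& bubbles) {
    int deepest = -1;
    for (const Bubble& b : bubbles)
        if (b.depth > deepest) deepest = b.depth;
    return deepest;
}
```

The `depth` field is the indentation level of the bird's-eye view, and `maxDepth` gives one crude number for nesting complexity.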

Page 13: Dealing with Software Complexity

Document Processing

- Pre-defined features: documents and words (word breakers)
- Statistics: distribution of words among documents
- Relevant words: words that occur more often in a given document than in the rest of the corpus
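The relevance criterion can be made concrete by comparing relative frequencies. The sketch below assumes a very simple word breaker (lower-cased runs of ASCII letters) and a plain "more frequent here than elsewhere" test; the names `WordCounts`, `countWords`, `total` and `relevantWords` are hypothetical, and a production system would likely add a threshold or smoothing.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using WordCounts = std::map<std::string, int>;

// A simple word breaker: lower-cased runs of ASCII letters.
WordCounts countWords(const std::string& text) {
    WordCounts counts;
    std::string word;
    for (std::size_t i = 0; i <= text.size(); ++i) {
        // A trailing space is simulated so that the last word is flushed.
        unsigned char c =
            i < text.size() ? static_cast<unsigned char>(text[i]) : ' ';
        if (std::isalpha(c)) {
            word += static_cast<char>(std::tolower(c));
        } else if (!word.empty()) {
            ++counts[word];
            word.clear();
        }
    }
    return counts;
}

int total(const WordCounts& counts) {
    int sum = 0;
    for (const auto& entry : counts) sum += entry.second;
    return sum;
}

// Words whose relative frequency in `doc` exceeds their relative frequency
// in `rest` (all other documents of the corpus, counted together).
std::vector<std::string> relevantWords(const WordCounts& doc,
                                       const WordCounts& rest) {
    std::vector<std::string> result;
    double docTotal = total(doc);
    double restTotal = total(rest);
    for (const auto& entry : doc) {
        auto inRest = rest.find(entry.first);
        double restCount = inRest == rest.end() ? 0.0 : inRest->second;
        double docFreq = entry.second / docTotal;
        double restFreq = restTotal > 0 ? restCount / restTotal : 0.0;
        if (docFreq > restFreq) result.push_back(entry.first);
    }
    return result;
}
```

A common word such as "the" is frequent everywhere, so it fails the test, while a word concentrated in one document passes it.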

Page 14: Dealing with Software Complexity

Clustering

- Documents with similar relevant word profiles form clusters
- Categorization of documents based on statistical features
- Categorization: automatic generation of abstractions

Page 15: Dealing with Software Complexity

Clustering

- Pick a representative set of N relevant words/phrases from the whole corpus
- Each document is a point in N-dim space
- Distance between documents in N dim
- Add gravitational attraction (potential)
- Documents will start clustering just like galaxies in the early universe
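The slide does not specify the distance or the potential, so the sketch below makes two explicit assumptions: Euclidean distance between documents, and an overdamped update in which every document is displaced toward every other by `step / (d^2 + softening)`. The names `Point`, `euclideanDistance` and `attract`, and both parameters, are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One document: one coordinate per relevant word (for example its frequency).
using Point = std::vector<double>;

double euclideanDistance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k) {
        double diff = a[k] - b[k];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// One "gravity" step without velocities: every document moves toward every
// other document by step / (d^2 + softening) along the line joining them.
// All displacements are computed from the positions before the step.
void attract(std::vector<Point>& docs, double step, double softening) {
    std::vector<Point> moved = docs;
    for (std::size_t i = 0; i < docs.size(); ++i) {
        for (std::size_t j = 0; j < docs.size(); ++j) {
            if (i == j) continue;
            double d = euclideanDistance(docs[i], docs[j]);
            if (d == 0.0) continue;  // coincident points exert no pull
            double pull = step / (d * d + softening);
            for (std::size_t k = 0; k < docs[i].size(); ++k)
                moved[i][k] += pull * (docs[j][k] - docs[i][k]) / d;
        }
    }
    docs.swap(moved);
}
```

Because the pull falls off with the square of the distance, nearby documents contract into clumps much faster than distant ones drift together. If `step` is too large relative to the distances, points can overshoot each other, so small steps applied repeatedly are the safer choice.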

Page 16: Dealing with Software Complexity

Example of Clustering

- music, composer, dance, ballet, dancer, choreographer, musical, piano, opera, folk, orchestra, Russian, French, New York City, jazz, company, ballerina, song, melody, Italian

- English, poetry, poet, verse, volume, literature, poem, England, circle, prose, century, London, life, lyric, novel, language, love, author, john, romanticism

- god, philosopher, philosophy, mythology, Greek, goddess, old testament, human, existence, knowledge, new testament, theology, Jesus Christ, Immanuel Kant, evil, mind, book, son, Apollo, religion

Page 17: Dealing with Software Complexity

Statistics & Abstractions

- Nature gathers statistics in a very slow process
- Science is based on statistics
  - Physics
  - Math
- Statistical methods (clustering) in program analysis

Page 18: Dealing with Software Complexity

Programs as Documents

- High-level structure of program
  - Reflects programmers’ ideas
  - Information encoded in vocabulary: names, comments
  - Strong influence of problem domain
  - Depends on personal style
- Can be processed like document corpus
  - Relevant words, clusters
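To treat a source file as a document, its names first have to be broken into words. The sketch below is one plausible word breaker for code, offered as an assumption rather than the method used in the talk: `splitIdentifier` and `vocabulary` are hypothetical names, identifiers are split at underscores and at lower-to-upper case transitions, and keywords, numbers and comment words are counted along with everything else.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Splits an identifier into lower-cased words at underscores and at
// lower-to-upper case transitions, e.g. "DisplayManager" -> display, manager.
std::vector<std::string> splitIdentifier(const std::string& identifier) {
    std::vector<std::string> words;
    std::string word;
    char previous = '\0';
    for (char ch : identifier) {
        unsigned char c = static_cast<unsigned char>(ch);
        bool boundary =
            ch == '_' ||
            (std::isupper(c) &&
             std::islower(static_cast<unsigned char>(previous)));
        if (boundary && !word.empty()) {
            words.push_back(word);
            word.clear();
        }
        if (ch != '_') word += static_cast<char>(std::tolower(c));
        previous = ch;
    }
    if (!word.empty()) words.push_back(word);
    return words;
}

// Counts the words found in every run of letters, digits and underscores.
std::map<std::string, int> vocabulary(const std::string& source) {
    std::map<std::string, int> counts;
    std::string identifier;
    for (std::size_t i = 0; i <= source.size(); ++i) {
        // A trailing space is simulated so that the last run is flushed.
        unsigned char c =
            i < source.size() ? static_cast<unsigned char>(source[i]) : ' ';
        if (std::isalnum(c) || c == '_') {
            identifier += static_cast<char>(c);
        } else if (!identifier.empty()) {
            for (const std::string& word : splitIdentifier(identifier))
                ++counts[word];
            identifier.clear();
        }
    }
    return counts;
}
```

The resulting counts have the same shape as word counts for ordinary text, so the relevant-word and clustering steps described for documents can be applied to source files unchanged. Runs of capitals such as an acronym followed by a word are not split by this rule.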

Page 19: Dealing with Software Complexity

Statistical Discovery

- Intended structures, for instance Model, View, Controller
- Hidden structures, for instance separation of UI from the “model”
- Horizontal structures, aspects
- Vertical structures, exceptions, exception specification
- Copy and paste programming, code reuse

Page 20: Dealing with Software Complexity

Integration with Tools

- Overnight automatic program analysis
  - Creating a map of abstractions
  - Creating various views (bubble diagrams, trees, layers)
  - Discovering hidden structures
- The ecosphere of a program
  - Evolving automata, feeding on code and on statistical abstractions
- Maintenance by programmer
  - Encapsulating hidden structures
  - Improving existing abstractions
  - Adding new abstractions