UC Berkeley CS169, November 1, 2007
Software Development in an Academic Environment: Lessons learned and not learned
Christopher Brooks, CHESS Executive Director
With material from:
Edward A. Lee
H. John Reekie
Nov 1, 2008 UCB CS169Software Development in an Academic Environment 2
Intros
• Who is graduating this semester?
  – Going to industry?
  – Going to academia?
  – Don’t know?
  – Don’t care?
• Who is graduating in 2008?
  – Going to industry?
  – Going to academia?
  – Don’t know?
  – Don’t care?
• Languages
  – Java?
  – C#
Christopher Brooks
• I’m a release engineer, training electrical engineers in the art of software engineering.
• I’ve worked with Professor Edward A. Lee since 1992, first on Ptolemy Classic (C++) and now on Ptolemy II (Java).
• I took CS 169 with Prof. Brewer in the mid-90s, before the dot-com era.
Software Engineering at Berkeley
• No Formal Program: Part of CS
• Survey of CS Baccalaureate grads, 2003-2006: 196 of 409 grads responded
• Of the 196:
  – 127 (65%) employed
  – 39 (20%) in grad school
  – 18 (9%) seeking employment
  – 12 (6%) Other Endeavors
• 97 Employer/Titles listed
• 30 “Software Engineers”
• 59 Titles contain the word “Engineer”
Source: http://career.berkeley.edu/Major/CompSci.stm
Lower Division Computer Science Prerequisites
• CS 61A (Structure and Interpretation of Computer Programs), 61B (Data Structures), 61C (Machine Structures)
• Math 1A and Math 1B (can be satisfied with Advanced Placement),
• Math 54 (Linear Algebra and Differential Equations)
• CS 70 (Discrete Mathematics and Probability Theory)
• EECS 42 (Electronics). (We highly recommend taking EECS 43, a one-unit laboratory course taken P/NP, during the same semester as EECS 42.)
Source: http://www.eecs.berkeley.edu/csugrad/index.shtml
Required Courses for Satisfaction of the CS Major
• L&S CS majors must earn 27 units in upper division technical courses, including:
• Required Course: CS 170 (Algorithms)
• Breadth courses: choose two from the following:
  – CS 160. User Interfaces
  – CS 161. Computer Security
  – CS 162. Operating Systems and System Programming
  – CS 164. Programming Languages and Compilers
  – CS 169. Software Engineering
  – CS 184. Foundations of Computer Graphics
  – CS 186. Introduction to Databases
• Any two additional Computer Science courses.
• Technical electives.
  – Including Undergraduate Business Administration
    • UGBA 103 Introduction to Finance (Prereq: UGBA 101A)
    • UGBA 119 Strategic Planning (101A-101B, 102A-102B, 103, 105, and senior standing)
    • UGBA 140 Introduction to Management Science ??
    • UGBA 146 Planning and Design of E-Business Sys (Prereq: CS3)
    • UGBA 152 Negotiation and Conflict Resolution (Prereq: UGBA 105)
Software Engineering
• Not much coding
  – Mostly bug fixing
• More management as time goes on
  – Team Dynamics and Virtual Teams
  – Project Management
Classes I wish I had taken
• Group Dynamics Classes
  – How do teams operate
• Intellectual Property
  – Patents
  – Copyrights
• Project Management
PMI: Project Management Institute
• “A Guide to the Project Management Body of Knowledge” (aka PMBOK)
• PMI Certification
  – Certified Associate in Project Management (CAPM)
  – Project Management Professional (PMP)
  – Program Management Professional (PgMP)
Ptolemy II:
• Ptolemy II: Set of Java packages supporting
  – heterogeneous,
  – concurrent modeling,
  – simulation, and
  – design of component-based systems.
• The kernel: definition and manipulation of clustered hierarchical graphs, which are collections of entities and relations between those entities.
• The actor package extends the kernel so that entities have functionality and can communicate via the relations.
• The domains extend the actor package by imposing models of computation on the interaction between entities.
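The three layers above can be illustrated with a minimal sketch. Every class name here (SketchEntity, SketchRelation, SketchActor, SequentialDirector) is a hypothetical stand-in invented for this example and is not the Ptolemy II API; the sketch only assumes that a kernel stores a graph, an actor layer adds behavior and communication, and a domain decides how actors are executed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KernelSketch {
    // Build a two-actor model and run it under a trivial "domain".
    static List<Integer> demo() {
        final List<Integer> seen = new ArrayList<>();
        SketchActor source = new SketchActor("source") {
            @Override void fire() { send(42); }
        };
        SketchActor sink = new SketchActor("sink") {
            @Override void fire() { seen.addAll(inbox); inbox.clear(); }
        };
        SketchRelation wire = new SketchRelation();
        wire.link(source);
        wire.link(sink);
        new SequentialDirector().run(Arrays.asList(source, sink));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}

// Kernel layer (hypothetical): a graph of entities and relations, no behavior.
class SketchEntity {
    final String name;
    final List<SketchRelation> relations = new ArrayList<>();
    SketchEntity(String name) { this.name = name; }
}

class SketchRelation {
    final List<SketchEntity> linked = new ArrayList<>();
    void link(SketchEntity entity) {
        linked.add(entity);
        entity.relations.add(this);
    }
}

// Actor layer (hypothetical): entities gain functionality and can
// communicate with the other actors linked through their relations.
abstract class SketchActor extends SketchEntity {
    final List<Integer> inbox = new ArrayList<>();
    SketchActor(String name) { super(name); }

    abstract void fire();

    void send(int token) {
        for (SketchRelation relation : relations) {
            for (SketchEntity entity : relation.linked) {
                if (entity != this && entity instanceof SketchActor) {
                    ((SketchActor) entity).inbox.add(token);
                }
            }
        }
    }
}

// Domain layer (hypothetical): imposes a model of computation. This one
// simply fires each actor once, in the order given.
class SequentialDirector {
    void run(List<SketchActor> actors) {
        for (SketchActor actor : actors) {
            actor.fire();
        }
    }
}
```

Swapping SequentialDirector for a different scheduling policy would change how the same graph executes without touching the actors, which is the separation the slide describes.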
Ptolemy II: Our Laboratory for Experiments with Models of Computation
Director from a library defines component interaction semantics
Large, behaviorally-polymorphic component library.
Visual editor supporting an abstract syntax
Type system for transported data
Concurrency management supporting dynamic model structure.
Source: Edward A. Lee
Ptolemy II: Functionality of Components is Given in C or Java (which can wrap C, C++, Perl, Python, MATLAB, Web services, Grid services, …)
Source: Edward A. Lee
Example: Discrete Event Models
DE Director implements timed semantics using an event queue.
Figure labels: event source, time line, reactive actors, signal.
Components send time-stamped events to other components, and components react in chronological order.
Source: Edward A. Lee
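The event-queue idea can be sketched in a few lines of Java. This is an illustration only, not the Ptolemy II DE Director: the names DiscreteEventSketch, Reactor, and Event are hypothetical, and the sketch assumes that events with equal time stamps are delivered in the order they were posted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class DiscreteEventSketch {
    /** A component that reacts to a time-stamped event (hypothetical). */
    interface Reactor {
        void react(double time, String value);
    }

    /** A time-stamped event addressed to one reactor. */
    static final class Event implements Comparable<Event> {
        final double time;
        final long sequence;   // tie-breaker: posting order
        final Reactor target;
        final String value;

        Event(double time, long sequence, Reactor target, String value) {
            this.time = time;
            this.sequence = sequence;
            this.target = target;
            this.value = value;
        }

        @Override
        public int compareTo(Event other) {
            int byTime = Double.compare(time, other.time);
            return byTime != 0 ? byTime : Long.compare(sequence, other.sequence);
        }
    }

    private final PriorityQueue<Event> queue = new PriorityQueue<>();
    private long nextSequence = 0;

    /** Post an event; it may be posted out of chronological order. */
    void post(double time, Reactor target, String value) {
        queue.add(new Event(time, nextSequence++, target, value));
    }

    /** Deliver all pending events in time-stamp order. */
    void run() {
        while (!queue.isEmpty()) {
            Event event = queue.poll();
            event.target.react(event.time, event.value);
        }
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        DiscreteEventSketch director = new DiscreteEventSketch();
        Reactor display = (time, value) -> log.add(time + " display " + value);
        Reactor logger = (time, value) -> log.add(time + " logger " + value);
        director.post(2.0, display, "b");  // posted first, delivered second
        director.post(1.0, display, "a");
        director.post(2.0, logger, "c");
        director.run();
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because delivery order depends only on the time stamps and the posting order, the result is the same on every run.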
Production vs. Research
Source: http://www.esi.es/Families/E1.4b-Method-Catalogue/CAFE/Details1-v0.1.html
Extreme Programming
Nightly Build
• Build and test the system regularly
  – Every night
• Why? Because it is easier to fix problems earlier than later
  – Easier to find the cause after one change than after 1,000 changes
  – Prevents new code from building on buggy code
• Aiken: The test is usually a subset of the full regression test
  – “smoke test”
  – Just make sure there is nothing horribly wrong
• Keutzer: I disagree with this point. The typical case should be to run the entire regression test
• Jim McCarthy (Director of MSVC++ Group): “If you build it, it will ship”
• Build a release every night, run tests – makes integration easier.
Aiken
Ptolemy II Nightly Build
– Ptolemy II has ~6700 tests for ~2100 Java files, which contain 675,000 lines of code.
Code Coverage
• The fireOneRound() method is not covered
Coding Style Features
• Code should have a consistent style
  – Decided by one person (Professor, CTO)
  – Enforced by a tool
• Document using complete sentences with good grammar
  – Be nice to yourself and others who use your code.
• Identifiers use complete words (CamelCase)
  – This aids in readability and accessibility – the developer knows that the variable or method is numberOfEspressos, not numEsp or n.
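A small illustration of these style points follows. The class EspressoOrder and its methods are hypothetical and exist only to show complete-word identifiers and documentation written in complete sentences; the exact comment layout is an assumption, not a prescribed house style.

```java
/**
 * A running tally of the espressos ordered at one table.
 * This class is a hypothetical example of naming and documentation style.
 */
class EspressoOrder {
    /** The number of espressos ordered so far. */
    private int numberOfEspressos = 0;

    /**
     * Add one espresso to the order.
     * @return The number of espressos ordered so far, including this one.
     */
    int addEspresso() {
        numberOfEspressos++;
        return numberOfEspressos;
    }

    /**
     * Return the number of espressos ordered so far.
     * @return The number of espressos ordered so far.
     */
    int getNumberOfEspressos() {
        return numberOfEspressos;
    }

    public static void main(String[] args) {
        EspressoOrder order = new EspressoOrder();
        order.addEspresso();
        order.addEspresso();
        System.out.println(order.getNumberOfEspressos());
    }
}
```

A reader meeting getNumberOfEspressos() for the first time does not need to guess what numEsp or n would have meant.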
Testing Documentation: doccheck
Doccheck is a javadoc plug-in from Sun that points out common problems.
Regression Tests: Test Plan
• Strategy: A good strategy would be?
• Goals:
  – Ensure basic functionality according to the Design Spec.
  – Ensure that the functionality does not regress (i.e. previously fixed bugs do not reappear or cause new bugs).
  – Focus on stress, capacity, and boundary conditions.
• Scope: Regression testing targets commands or program functions.
  – All commands and/or program functions should be covered by regression tests.
  – List the commands and/or program functions which will be tested.
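The three goals in the test plan can be sketched as a tiny regression suite. Everything in this example is hypothetical: parseCount() stands in for a "program function" under test, the whitespace case stands in for a previously fixed bug, and the hand-rolled check() helper is used instead of a test framework only to keep the sketch self-contained.

```java
class RegressionSketch {
    /**
     * Hypothetical function under test: parse a non-negative decimal
     * integer, returning -1 for any invalid input.
     */
    static int parseCount(String text) {
        if (text == null) {
            return -1;
        }
        try {
            int value = Integer.parseInt(text.trim());
            return value >= 0 ? value : -1;
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    private static int checksRun = 0;

    static void check(String description, int expected, int actual) {
        checksRun++;
        if (expected != actual) {
            throw new AssertionError(description + ": expected " + expected
                    + " but got " + actual);
        }
    }

    /** Run every check and return how many were run. */
    static int runAll() {
        checksRun = 0;
        // Goal 1: basic functionality.
        check("plain number", 42, parseCount("42"));
        // Goal 2: a (hypothetical) previously fixed bug must stay fixed.
        check("surrounding whitespace", 7, parseCount(" 7 "));
        // Goal 3: boundary conditions.
        check("zero", 0, parseCount("0"));
        check("largest int", Integer.MAX_VALUE, parseCount("2147483647"));
        check("overflow", -1, parseCount("2147483648"));
        check("negative", -1, parseCount("-1"));
        check("empty string", -1, parseCount(""));
        return checksRun;
    }

    public static void main(String[] args) {
        System.out.println(runAll() + " checks passed");
    }
}
```

In a real project the same checks would normally live in a test framework such as JUnit and run as part of the nightly build.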
Design and Code Reviews
• Objective is “publishable software”
• Defined roles for participants
– Author has the last word
• Mechanism for new group members to learn to differentiate good from bad software.
All technical reviews are based on the idea that developers are blind to some of the trouble spots in their work...
- Steve McConnell
John Reekie and the Ptolemy team
Code Rating
• A simple framework for
  – quality improvement by peer review
  – change control by improved visibility
• Four confidence levels
  – Red. No confidence at all.
  – Yellow. Passed design review. Soundness of the APIs.
  – Green. Passed code review. Quality of implementation.
  – Blue. Passed final review. Backwards-compatibility assurance.
• What is this about really?
  – Confidence in quality
  – Commitment to stability
Source: John Reekie
Orca and testing
• Orca: “an open-source framework for developing component-based robotic systems” (http://orca-robotics.sourceforge.net/)
• Orca uses CMake (http://www.cmake.org) to configure the system
• Orca uses the Dart2 Dashboard to show nightly build output: http://129.78.210.237:8081/orca2/Dashboard/
• Orca uses CTest, part of CMake, for testing. http://wiki2.cas.edu.au/orca/index.php/Orca:faq:general:testing:write:tests
Orca Dashboard http://129.78.210.237:8081/orca2/Dashboard/
Orca Dashboard Coverage
Orca Dashboard Coverage Detail
Lessons not learned: Process Problems
• Review process decayed
• No read-ahead
  – This is a walkthrough, not a review
• No follow up
Automatic Tools
• Static code checkers
  – gcc
  – Coverity – not Free
  – Java tools like FindBugs and PMD
• Memory checkers
  – Electric Fence (for C)
  – Purify – not Free
Software Tools
• JUnit (xUnit)
• Eclipse
• Maven
• CMake/CTest
• Ant – be fully buzzword compliant, but know how to use make.
Lessons not Learned: Threads
The Problem with Threads
• Edward A. Lee: IEEE Computer, May 2006 article
• “For concurrent programming to become mainstream, . . .
• we must discard threads as a programming model”.
• “Nondeterminism should be judiciously and carefully introduced when needed, . . .
• and it should be explicit in programs.”
To See That Current Practice is Bad, Consider a Simple Example
“The Observer pattern defines a one-to-many dependency between a subject object and any number of observer objects so that when the subject object changes state, all its observer objects are notified and updated automatically.”
Design Patterns, Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides (Addison-Wesley Publishing Co., 1995. ISBN: 0201633612)
Observer Pattern
Source: Wikipedia
Observer Pattern in Java
public void addListener(listener) {…}

public void setValue(newValue) {
    myValue = newValue;
    for (int i = 0; i < myListeners.length; i++) {
        myListeners[i].valueChanged(newValue);
    }
}
Thanks to Mark S. Miller for the details of this example.
Will this work in a multithreaded context?
Observer Pattern With Mutual Exclusion (Mutexes)
public synchronized void addListener(listener) {…}

public synchronized void setValue(newValue) {
    myValue = newValue;
    for (int i = 0; i < myListeners.length; i++) {
        myListeners[i].valueChanged(newValue);
    }
}

Javasoft recommends against this. What’s wrong with it?
See also Allen Holub’s Java World Article “The Observer pattern and mysteries of the AWTEventMulticaster” http://www.javaworld.com/javaworld/jw-03-1999/jw-03-toolbox.html
Mutexes are Minefields
public synchronized void addListener(listener) {…}

public synchronized void setValue(newValue) {
    myValue = newValue;
    for (int i = 0; i < myListeners.length; i++) {
        myListeners[i].valueChanged(newValue);
    }
}

valueChanged() may attempt to acquire a lock on some other object and stall. If the holder of that lock calls addListener(), deadlock!
After years of use without problems, a Ptolemy Project code review found code that was not thread safe. It was fixed in this way. Three days later, a user in Germany reported a deadlock that had not shown up in the test suite.
Simple Observer Pattern Becomes Not So Simple
public synchronized void addListener(listener) {…}

public void setValue(newValue) {
    // While holding the lock, make a copy of listeners to avoid race conditions.
    synchronized(this) {
        myValue = newValue;
        listeners = myListeners.clone();
    }
    // Notify each listener outside of the synchronized block to avoid deadlock.
    for (int i = 0; i < listeners.length; i++) {
        listeners[i].valueChanged(newValue);
    }
}

This still isn’t right. What’s wrong with it?
Simple Observer Pattern: How to Make It Right?
public synchronized void addListener(listener) {…}

public void setValue(newValue) {
    synchronized(this) {
        myValue = newValue;
        listeners = myListeners.clone();
    }
    for (int i = 0; i < listeners.length; i++) {
        listeners[i].valueChanged(newValue);
    }
}

Suppose two threads call setValue(). One of them will set the value last, leaving that value in the object, but listeners may be notified in the opposite order. The listeners may be alerted to the value changes in the wrong order!
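One possible way to keep notifications in the same order as the updates, offered here as a sketch and not as the fix proposed in the slides, is to enqueue each notification while still holding the lock and let a single worker thread deliver them. The class OrderedSubject, its Listener interface, and the history list (kept only so the demo can verify ordering) are all hypothetical. The trade-offs are real: listeners run asynchronously on another thread, and one slow listener delays every later notification.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OrderedSubject {
    interface Listener {
        void valueChanged(int newValue);
    }

    private final List<Listener> listeners = new CopyOnWriteArrayList<>();
    // A single worker thread delivers notifications in submission order.
    private final ExecutorService notifier = Executors.newSingleThreadExecutor();
    private final List<Integer> history = new ArrayList<>(); // guarded by "this"
    private int value;                                        // guarded by "this"

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    // The update and the enqueue happen under one lock, so the queue order
    // matches the update order. Listeners are called later, on the worker
    // thread, without this object's lock being held.
    synchronized void setValue(final int newValue) {
        value = newValue;
        history.add(newValue);
        notifier.execute(() -> {
            for (Listener listener : listeners) {
                listener.valueChanged(newValue);
            }
        });
    }

    synchronized int getValue() {
        return value;
    }

    synchronized List<Integer> getHistory() {
        return new ArrayList<>(history);
    }

    // Stop accepting updates and wait for pending notifications.
    boolean shutdownAndWait() {
        notifier.shutdown();
        try {
            return notifier.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Two threads race to set values; returns true if the listener saw the
    // values in exactly the order in which they were set.
    static boolean demo() {
        OrderedSubject subject = new OrderedSubject();
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        subject.addListener(newValue -> seen.add(newValue));
        Thread first = new Thread(() -> {
            for (int i = 0; i < 500; i++) subject.setValue(i);
        });
        Thread second = new Thread(() -> {
            for (int i = 1000; i < 1500; i++) subject.setValue(i);
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return subject.shutdownAndWait() && seen.equals(subject.getHistory());
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Even this version leaves questions open, for example a listener that calls getValue() may see a newer value than the one it was notified about, which supports the slides' point that the simple pattern is no longer simple.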
If the simplest design patterns yield such problems, what about non-trivial designs?
/**
CrossRefList is a list that maintains pointers to other CrossRefLists.
…
@author Geroncio Galicia, Contributor: Edward A. Lee
@version $Id: CrossRefList.java,v 1.78 2004/04/29 14:50:00 eal Exp $
@since Ptolemy II 0.2
@Pt.ProposedRating Green (eal)
@Pt.AcceptedRating Green (bart)
*/
public final class CrossRefList implements Serializable {
    …
    protected class CrossRef implements Serializable {
        …
        // NOTE: It is essential that this method not be
        // synchronized, since it is called by _farContainer(),
        // which is. Having it synchronized can lead to
        // deadlock. Fortunately, it is an atomic action,
        // so it need not be synchronized.
        private Object _nearContainer() {
            return _container;
        }

        private synchronized Object _farContainer() {
            if (_far != null) return _far._nearContainer();
            else return null;
        }
        …
    }
}
Code that had been in use for four years, central to Ptolemy II, with an extensive test suite with 100% code coverage, design reviewed to yellow, then code reviewed to green in 2000, caused a deadlock during a demo on April 26, 2004.
Edward Lee’s Claim
Nontrivial concurrent software written with threads is incomprehensible to humans and cannot be trusted!
Maybe better abstractions would lead to better practice…
Succinct Problem Statement
Threads are wildly nondeterministic.
The programmer’s job is to prune away the nondeterminism by imposing constraints on execution order (e.g., mutexes) and limiting shared data accesses (e.g., OO design).
Perhaps Concurrency is Just Hard…
Sutter and Larus observe:
“humans are quickly overwhelmed by concurrency and find it much more difficult to reason about concurrent than sequential code. Even careful people miss possible interleavings among even simple collections of partially ordered operations.”
H. Sutter and J. Larus. Software and the concurrency revolution. ACM Queue, 3(7), 2005.
If concurrency were intrinsically hard, we would not function well in the physical world
It is not concurrency that is hard…
…It is Threads that are Hard!
Threads are sequential processes that share memory. From the perspective of any thread, the entire state of the universe can change between any two atomic actions (itself an ill-defined concept).
Imagine if the physical world did that…
Yet threads are the basis for all widely used concurrency models, as well as the basis for I/O interactions and network interactions in modern computers.
Succinct Solution Statement
Instead of starting with a wildly nondeterministic mechanism and asking the programmer to rein in that nondeterminism, start with a deterministic mechanism and incrementally add nondeterminism where needed.
The question is how to do this and still get concurrency.
Actor-Oriented Design
The established: Object-oriented
• class name, data, methods
• call and return
• What flows through an object is sequential control
• Things happen to objects

The alternative: Actor-oriented
• actor name, data (state), parameters, ports
• input data and output data
• What flows through an object is evolving data
• Actors make things happen
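The contrast can be made concrete with a small sketch in which data flows through components and no component calls another's methods. The names ActorSketch and END are hypothetical, and the blocking queues stand in for ports; this is an illustration of the idea under those assumptions, not Ptolemy II code. Threads are still used underneath, but the actors share no state other than the queues, and each actor blocks until its input arrives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ActorSketch {
    // Hypothetical end-of-stream marker passed along the queues.
    static final int END = Integer.MIN_VALUE;

    static List<Integer> demo() {
        final BlockingQueue<Integer> sourceToScale = new LinkedBlockingQueue<>();
        final BlockingQueue<Integer> scaleToSink = new LinkedBlockingQueue<>();
        List<Integer> received = new ArrayList<>();

        // Source actor: produces the tokens 1..5 on its output port.
        Thread source = new Thread(() -> {
            for (int i = 1; i <= 5; i++) {
                sourceToScale.add(i);
            }
            sourceToScale.add(END);
        });

        // Scale actor: reads one token, doubles it, writes one token.
        Thread scale = new Thread(() -> {
            try {
                while (true) {
                    int token = sourceToScale.take();
                    if (token == END) {
                        scaleToSink.add(END);
                        return;
                    }
                    scaleToSink.add(2 * token);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        source.start();
        scale.start();

        // Sink actor: runs on the calling thread and collects the results.
        try {
            while (true) {
                int token = scaleToSink.take();
                if (token == END) {
                    break;
                }
                received.add(token);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

However the scheduler interleaves the threads, the sink receives the same sequence, because each actor's output depends only on the order of tokens on its input queue.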
The First (?) Actor-Oriented Programming Language
The On-Line Graphical Specification of Computer Procedures
W. R. Sutherland, Ph.D. Thesis, MIT, 1966
MIT Lincoln Labs TX-2 Computer
Bert Sutherland with a light pen
Partially constructed actor-oriented model with a class definition (top) and instance (below).
Bert Sutherland used the first acknowledged object-oriented framework (Sketchpad, created by his brother, Ivan Sutherland) to create the first actor-oriented programming language (which had a visual syntax).
Examples of Actor-Oriented Coordination Languages
• CORBA event service (distributed push-pull)
• ROOM and UML-2 (dataflow, Rational, IBM)
• VHDL, Verilog (discrete events, Cadence, Synopsys, ...)
• LabVIEW (structured dataflow, National Instruments)
• Modelica (continuous-time, constraint-based, Linkoping)
• OPNET (discrete events, Opnet Technologies)
• SDL (process networks)
• Occam (rendezvous)
• Ptolemy (various, Berkeley)
• Simulink (Continuous-time, The MathWorks)
• SPW (synchronous dataflow, Cadence, CoWare)
• …
Many of these are domain specific.
Many of these have visual syntaxes.
The semantics of these differ considerably, but all can be modeled as [formula not preserved in this transcript] with appropriate choices of the set T.
11 Steps to successfully completing a software project
1. Create a one page charter
2. Separation of concerns: MVC: GUI vs. backend
3. Start writing tests early, use a code coverage tool
4. Use a nightly build
5. Use a consistent coding style and use a tool to enforce the style
6. Use tools: memory leaks, warnings, spelling errors, performance problems, other compilers, other operating systems.
7. Document your code. Writing documentation first can prevent hours of wasted time.
8. Don’t debug for more than an hour by yourself – get help.
9. Design Review and Code Review (or at least desk check)
10. Expect the unexpected: wacky user input, wacky user interaction
11. Don’t be afraid to throw away code and start over.