
libMesh: Lessons in Distributed Collaborative Design and Development

Roy H. Stogner1 John W. Peterson2

1The University of Texas at Austin

2Idaho National Laboratory

Feb 26, 2013

Roy H. Stogner, John W. Peterson Distributed Development Feb 26, 2013 1 / 31


Outline

1 Introduction

2 Collaboration Strategies

3 API Development

4 Source Code Control
    Attention Deficit Development
    Linear and Nonlinear History

5 Development and Testing

6 Build Systems


Introduction

libMesh Finite Element Library

Scope
• Open source, free to download
  ◦ LGPL
• 13 Ph.D. theses, 186 papers (30 in 2012)
• ∼10 current developers
• O(100) current users?

Challenges
• Widely dispersed core developers
  ◦ INL, UT-Austin, JSC, MIT, Harvard, Argonne
• ITAR, commercial applications common
• Radically different application types


Collaboration Strategies

Communication
• Face to face, instant messaging, teleconference
• Email lists
  ◦ [email protected], [email protected]
• Trac tickets, Redmine issues
• SourceForge, GitHub issues

Code
• Email attachments
• Ticket attachments
  ◦ Repository forks!
  ◦ Pull requests!


API Development

Tracking API Changes

API versions easily proliferate...

    #if PETSC_VERSION_LESS_THAN(3,1,0)
      // Older PETSc: the call takes an extra PETSC_DECIDE argument
      ierr = MatGetSubMatrix(matrix->mat(),
                             _restrict_to_is, _restrict_to_is_complement,
                             PETSC_DECIDE, MAT_INITIAL_MATRIX, &submat1);
      CHKERRABORT(libMesh::COMM_WORLD, ierr);
    #else
      // PETSc 3.1.0 and later: same call without that argument
      ierr = MatGetSubMatrix(matrix->mat(),
                             _restrict_to_is, _restrict_to_is_complement,
                             MAT_INITIAL_MATRIX, &submat1);
      CHKERRABORT(libMesh::COMM_WORLD, ierr);
    #endif

• Maintain a wide range of external compatibility
• Limit libMesh API changes
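One way to keep such version checks from spreading is to confine each one to a single wrapper, so that calling code sees only one signature. The following is a minimal sketch of that idea, not libMesh's or PETSc's actual code: the EXT_VERSION_* macros, ext_get_submatrix, and get_submatrix are all hypothetical stand-ins for an external library whose function dropped an argument in version 3.1.0.

```cpp
#include <cassert>

// Hypothetical version numbers for an imaginary external library "ext".
// A real build system would normally define these in a generated header.
#define EXT_VERSION_MAJOR 3
#define EXT_VERSION_MINOR 2
#define EXT_VERSION_PATCH 0

// True when the configured version is strictly older than major.minor.patch.
#define EXT_VERSION_LESS_THAN(major, minor, patch)                 \
  ((EXT_VERSION_MAJOR < (major)) ||                                \
   (EXT_VERSION_MAJOR == (major) &&                                \
    ((EXT_VERSION_MINOR < (minor)) ||                              \
     (EXT_VERSION_MINOR == (minor) &&                              \
      EXT_VERSION_PATCH < (patch)))))

// Stand-ins for the external call: the old signature took an extra size hint.
#if EXT_VERSION_LESS_THAN(3, 1, 0)
int ext_get_submatrix(int n_rows, int n_cols, int /*size_hint*/)
{
  return n_rows * n_cols;
}
#else
int ext_get_submatrix(int n_rows, int n_cols)
{
  return n_rows * n_cols;
}
#endif

// The only function that knows about the version difference.
// Returns the number of entries in the (imaginary) submatrix.
int get_submatrix(int n_rows, int n_cols)
{
  assert(n_rows >= 0 && n_cols >= 0);
#if EXT_VERSION_LESS_THAN(3, 1, 0)
  const int let_library_decide = -1; // plays the role of PETSC_DECIDE
  return ext_get_submatrix(n_rows, n_cols, let_library_decide);
#else
  return ext_get_submatrix(n_rows, n_cols);
#endif
}
```

Callers then write get_submatrix(2, 3) regardless of which version of the external library is installed, and the preprocessor branch lives in exactly one place.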


Signaling API Changes

Development practices
• Old, new APIs overlap
• Easier with C++ function overloading, default arguments
  ◦ Adding f(a,b) does not preclude keeping f(a)
  ◦ Adding f(a,b=default) can replace f(a)

Runtime warnings
• libmesh_experimental() (in-flux APIs)
• libmesh_deprecated() (1 year, 1-2 releases)

Examples
• OStringStream workaround class
• Parallel:: global functions
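The overlap of old and new APIs described above can be sketched in a few lines of C++. This is an illustration under stated assumptions, not libMesh source: SKETCH_DEPRECATED is a hypothetical warn-once macro standing in for a runtime deprecation warning, and scale and shift are made-up functions.

```cpp
#include <cassert>
#include <iostream>

// Counts warnings so the "print once" behavior can be checked.
int deprecation_warnings_printed = 0;

// Hypothetical stand-in for a deprecation macro: warns the first time each
// call site is reached, then stays quiet.
#define SKETCH_DEPRECATED()                                         \
  do {                                                              \
    static bool warned = false;                                     \
    if (!warned) {                                                  \
      warned = true;                                                \
      ++deprecation_warnings_printed;                               \
      std::cerr << "*** Warning: deprecated code at " << __FILE__   \
                << ":" << __LINE__ << " ***" << std::endl;          \
    }                                                               \
  } while (0)

// New API: the caller chooses the factor.
double scale(double value, double factor)
{
  return value * factor;
}

// Old API kept as an overload; it warns and forwards to the new one.
double scale(double value)
{
  SKETCH_DEPRECATED();
  return scale(value, 2.0);
}

// Alternative: a default argument lets one function serve both call styles,
// so shift(a) keeps compiling after the second parameter is introduced.
double shift(double value, double offset = 1.0)
{
  return value + offset;
}
```

Existing callers of scale(x) keep working and see a single warning on standard error, while new code calls scale(x, factor) directly. Once the deprecation period has passed, the one-argument overload can be deleted without touching the new API.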


Source Code Control

• When discussing SCC software, the distinction between “distributed” and “centralized” is often stressed, perhaps unnecessarily.
• Distributed SCC software, like git, is very frequently used in a semi-centralized manner.
• The libMesh library is now distributed from GitHub¹, and therefore we focus on git in this talk, but the discussion should apply to other SCC software as well.

¹ https://github.com/libMesh/libmesh


Source Code Control: Attention Deficit Development

• A more intrinsic difference between various flavors of SCC software is rather the ability to make “local commits.”
• git and other “distributed” SCC software packages (e.g. hg) have this feature.
• SVN lacks this feature, and therefore makes work interruptions (which can be rather frequent in collaborative development) difficult to handle.


• Consider the following scenario:
  ◦ You are working on a new feature and have several locally modified files (“A”, “D”, or “M” state in svn status).
  ◦ You receive email from a collaborator about a bug fix he’d like you to test ASAP. His patch may or may not conflict with your current set of changes.
• What do you do?


• In SCC software without local commits, your choices are:
  1. Make a patch of your local changes (e.g. svn diff), revert them, and hope to come back to them later.
  2. See if your collaborator’s patch applies cleanly on top of what you are already doing.
  3. Create a fresh checkout, apply the patch, recompile everything, and test.
• The choices aren’t pretty:
  1. This is manual source code control, something tools should help you avoid!
  2. If the patch program fails, the results can be cryptic; if patch succeeds, it may be hard to revert later.
  3. This approach clearly doesn’t scale in disk space or CPU cycles.


• In SCC software with local commits, specifically git, and assuming you are working on my-branch, you:
  ◦ git ci your work (ci and co are common user-defined aliases for commit and checkout).
  ◦ Create a new branch, probably from master.
  ◦ Apply your collaborator’s patch, let him know what you find.
  ◦ git co my-branch
• Once you are back on my-branch, you can do a “soft reset” (for example, git reset --soft HEAD^, which undoes the temporary commit while keeping its changes) to get back to where you were before the interruption.
• If you don’t want to mess with extra branches, you can instead git stash what you’re currently doing, try out your collaborator’s patch, and git stash pop to return to your original state.


Source Code Control: Linear and Nonlinear History

• The first question a git-based development team² should debate is whether maintaining a “linear” history is desirable/important.
• There are pros and cons to both linear and nonlinear development histories.
• The answer probably lies somewhere between “rigidly-enforced linearity” and “merges gone wild.”

² Especially teams transitioning from SVN.


Example - Useful Nonlinearity

    * 4df7f73 Adding list of bibtex templates.
    * e04db6d Merge pull request #45 from benkirk/eigen
    |\
    | * e3bd55d get contributed Eigen into build system
    | * 13fa33d add eigen-3.1.2 unsupported API
    | * d03f946 adding eigen-3.1.2 to contrib
    |/
    * 1249c5d more fine-grained fallback for --disable-mpi
    * e15fef7 use <rpc/xdr.h> when it is there

• A (short-lived) feature branch is created, committed to, and merged back into master.
• Preserves the context in which development took place. Useful!


Example - Rigid Linearity

    * bc56be9 Fixes for our Epetra vector interface
    * 4f2b016 Making reading work, adding support ...
    * 243753e Again, don’t degrade to single precision ...
    * b277d0a We are using a vtkDoubleArray, so don’t ...
    * 6bac31a Hoist function calls out of loop conditionals.
    * fe85fae Standardizing spacing, formatting, indentation, etc.
    * ce703f9 Use VTK_LEGACY_REMOVE. Thanks, cato-, for the idea.
    * 84da4b4 prevent netcdf from running most of the ...

• Commits fe85fae through 4f2b016 are a group of logically connected changes.
• This grouping is lost because the author did his development directly on master instead of branching.


Example - Misleading Nonlinearity

    * cfd23fa Merge branch master
    |\
    | * 285ebaa Adding citations and webpage generation script.
    * | 1aa5d5f trump --enable-petsc with --disable-mpi
    * | 9644c5f fallback to rpc/xdr.h when looking for xdr.
    |/
    * 070515a LibMeshInit can accept more argument constness

• The three “middle” commits are unrelated to one another.
• The author of 9644c5f and 1aa5d5f ran git pull, bringing down an unrelated change, and producing a merge commit.
• Branch does not preserve any particular development context.


Example - Merges Gone Wild

    * 9f639a6 Merge branch master
    |\
    | * 936e197 Merge branch master
    | |\
    | | * 2b80c18 Remove uninitialized dphi warnings ...
    | * | 0309eac is_adjoint() bugfixes
    | |/
    * | 514052e UnsteadySolver fixes and optimization
    * | e5aac7b Merge branch master
    |\ \
    | |/
    | * b6155a5 Changes in quadrature_simpson_3D.C for -Wshadow.

• Three “merge” commits and four “real” commits.
• Low signal-to-noise ratio: nearly as many merge commits as real ones.
• git bisect often stops at merge commits.


Current Guidelines

• Strive for “useful nonlinearity.”
• Develop separate feature sets on separate branches; merge them back to master when complete.
• Minimize or eliminate periodic/unnecessary merge commits.
• Instead, rebase feature branches on top of master before merging and pushing.
• Rebasing public (aka shared) branches is bad™, so wait until you are ready to push, branch from the shared branch locally, rebase it on top of master, and then merge it.


• git can be complicated, but it is not inherently so.
• Forget cats, there is more than one way to skin every type of animal in git.
• Teams should find the approach that works best for them.
• http://nvie.com/posts/a-successful-git-branching-model


Development and Testing

Software Tracking

• Trac, Redmine - Wiki and issue tracking systems for software development projects
  ◦ http://trac.edgewall.org, http://www.redmine.org
  ◦ Interface to your VCS of choice
  ◦ Issue tracking (aka tickets) can reference commits and vice versa
  ◦ Open source (BSD, GPL2)
• Bitten, BuildBot - Continuous Integration
  ◦ http://bitten.edgewall.org, http://trac.buildbot.net
  ◦ Build recipes in XML, Python formats
  ◦ Can send build failure notifications directly to relevant parties


Issue tracking (tickets)


Build Status


Regression Testing


Diagnosing Failed Builds


Build Systems

Autotools, Pros and Cons

Autoconf
• Manages feature selection
  ◦ 50+ --enable-foo options
• Portability tests, workarounds
• POSIX shell dependence

Libtool
• Easily used via automake
• Broader shared library support
• DLL management in install
• More difficult in-place debugging

Automake
• dist, check, install targets
• Out-of-source builds
• Standardized conventions
• More difficult METHOD support
• “bootstrap” process
  ◦ Do users have autotools?
  ◦ Custom scripts for libMesh


Questions?
