Sociotechnical production systems for software in science James
Howison and Jim Herbsleb Institute for Software Research School of
Computer Science Carnegie Mellon University School of Information
University of Texas at Austin
http://james.howison.name/pubs/HowisonHerbsleb2011SciSoftIncentives.pdf
Slide 2
How does a cubic km of ice become a scientific paper?
Slide 3
First find some ice Image Credit: NASA
Slide 4
Build a big drill Image Credit: IceCube
Slide 5
and some Digital Optical Modules Image Credit: IceCube
Slide 6
Combine Image Credit: IceCube
Slide 7
Collect and filter data Image Credit: IceCube
Slide 8
Store and analyze it Image Credit:
http://www.flickr.com/photos/theplanetdotcom
Slide 9
Simulate light in ice Photo credit:
http://www.flickr.com/photos/rainman_yukky/
Slide 10
Simulate Atmosphere Image Credit: NASA
Slide 11
Model
Slide 12
Analyze
Slide 13
Plots
Slide 14
Publish
Slide 15
Software is everywhere
Slide 16
Enhancing reproducibility and correctness Saving money Driving
innovation Coalescing into widely used software platforms All
linked to software as information artifact: Re-playable Re-useable
Extendable An appealing vision of software
Slide 17
Yet software also has constraints Maintenance (avoiding bit
rot) Software must be maintained (synchronization work) Kept in sync
with complements and dependencies Coordinated Rapid development and
changes can lead to breakdown Path dependencies Easy to start, hard
to architect for widespread use
Slide 18
How to achieve the Software Vision? Better technologies? Better
engineering methods? Leadership/Norms/Ethics? Policy? Rewards?
Slide 19
A sociotechnical understanding Understand software work in
existing institutions of science Specific Research Questions: What
software is used? Who creates and maintains it? What incentives
drive its creation? Why is it trusted?
Slide 20
Method: Data Route into complex practice Chose paper as unit of
analysis: Focal Paper Trace back from paper to work that produced
it Semi-structured interviews Supported by artifacts (e.g.,
paper/methods and materials) Elicit workflow, focus on software
work Identify software authors/sources, and seek introductions
Qualitative analysis Phenomenological exhaustion
Slide 21
Case 1: STAR Image Credit: RHIC
Slide 22
Our focal paper
Slide 23
Workflow
Slide 24
Software Production 1. Employed core software development
Professional software developers ROOT4STAR framework 2. Core
simulation code Scientists undertaking service work 3. Analysis code
to get the plots Locally written, frozen at publication
Slide 25
Case 3: Bioinformatic microbiology Image Credit:
http://www.flickr.com/photos/grytr
Slide 26
Studying the nitrogen cycle Image Credit: Focal Paper
Slide 27
A field revolutionized by software
Slide 28
Personal software infrastructure Power user scripts Personal
competitive advantage that is something that most biologists can't
do. period. Share methods but not personal infrastructure code or
actively support others Methods and materials section should
provide enough information, if not he'll fix it. But not going to do
their homework for them
Slide 29
Publishing on software Tools potentially useful to others
described in separate publications, Software pubs Ambivalence: Can
you make a career out of this? Definitely But: he's known for his
software rather than his science he's known for facilitating science
rather than and some people have that reputation Advise a student
to do this? Yes, but if you happen to get a publication out of it
and it becomes a tool that's widely used, then great, that's
fantastic, better props for you but there's a danger Tool developers
are greatly under-appreciated
Slide 30
Algorithm people Self-described member of the algorithm people
as distinguished from biologists Muscle: biology == strcmp() Builds
from scratch (avoid tricky dependencies) Obvious that they don't
collaborate Credit accrues to the original publications Little
credit in perceived incremental improvements Politics of
improvement acceptance at the mercy of Competition is appropriate
and productive
Slide 31
Software Production systems Practice that is similar on four
aspects: 1. Incentives for the work 2. The type of artifacts produced
3. The way it is organized 4. The logic of correctness
Slide 32
Context: Academic reputation system
Slide 33
Software as support
Slide 34
Collaboration service-work
Slide 35
Academic credit: Incidental software
Slide 36
Academic credit: Parallel software practice
Slide 37
Systemic threats to software vision The type of software work
needed to realize the cyberinfrastructure vision is poorly
motivated Invisible work (Star and Ruhleder) Especially, little
incentive to collaborate Project owned by initial creators Initial
publications receive citations Extension dominated by
fork-and-rename
Slide 38
Academic reputation and integration James Howison and Jim
Herbsleb (2013) Sharing the spoils: incentives and integration in
scientific software production. ACM CSCW
Slide 39
Where to for science policy? Exhortations? Training? Forcing
open source through funding lever? Risk of substituting logics of
correctness Kleenex code as open source? Risk of undermining
appropriate competition Turn scientists into open source community
managers? When there is little reward for this work?
Slide 40
Scientific Software Network Map But, you know, imagine it as a
live, dynamic data set!
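One way to picture the network map as a data set is as a table of which papers used which software, from which co-use links can be counted. The following is only a minimal sketch under stated assumptions: the paper identifiers, the tool labels other than ROOT4STAR and Muscle, and the function name build_co_use_map are hypothetical and not taken from the study.

```python
from collections import defaultdict
from itertools import combinations


def build_co_use_map(paper_software):
    """Count how often each pair of tools is used together in one paper.

    paper_software maps a paper identifier to the list of software it used.
    Returns a dict mapping (tool_a, tool_b) pairs to a co-use count.
    """
    edges = defaultdict(int)
    for tools in paper_software.values():
        # Deduplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(tools)), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Hypothetical usage records, invented for illustration only.
usage = {
    "paper-1": ["ROOT4STAR", "analysis-scripts"],
    "paper-2": ["ROOT4STAR", "analysis-scripts", "simulation-code"],
    "paper-3": ["Muscle"],
}
co_use = build_co_use_map(usage)
```

Re-running the function as new usage records arrive is what would make such a map "live"; a tool used alone, like Muscle here, contributes no edges.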
Slide 41
Techniques for measuring use Software that reports its own use
Instrumentation Analysis of traces in papers Mentions, citations
Characteristic artifacts Analysis of collections of software On
supercomputing resources (TACC, NICS) Through workflow systems
(Galaxy, Pegasus, Taverna)
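The "analysis of traces in papers" technique can be sketched as counting how many papers mention each software name. This is a minimal illustration, not the authors' method: the sample sentences are invented, the function name count_software_mentions is hypothetical, and plain name matching is assumed to stand in for a real mention-detection process.

```python
import re
from collections import Counter


def count_software_mentions(paper_texts, software_names):
    """Count how many papers mention each software name at least once."""
    counts = Counter()
    for text in paper_texts:
        for name in software_names:
            # Whole-word, case-insensitive match; re.escape protects
            # names that contain regex metacharacters.
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            if re.search(pattern, text, flags=re.IGNORECASE):
                counts[name] += 1
    return counts


# Invented example sentences, for illustration only.
papers = [
    "Events were reconstructed within the ROOT4STAR framework.",
    "Alignments were built with MUSCLE before tree construction.",
    "Muscle tissue samples were collected from each specimen.",
]
mentions = count_software_mentions(papers, ["ROOT4STAR", "Muscle"])
```

The third sentence is counted as a mention of Muscle even though it refers to tissue, which shows why name matching alone over-counts and why citations and characteristic artifacts are useful as complementary traces.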
Slide 42
Contact James Howison http://james.howison.name
This material is based upon work
supported by the US National Science Foundation under Grant No.
0943168.