Scientific Software: sustainability, skills & sociology Neil Chue Hong, [email protected] Director, Software Sustainability Institute US/IAEA Workshop on Software Sustainability for Safeguards Instrumentation, Vienna www.software.ac.uk



Page 2: Scientific Software: Sustainability, Skills & Sociology

The Software Sustainability Institute

A national facility for cultivating world-class research through software

• Better software enables better research
• Software reaches boundaries in its development cycle that prevent improvement, growth and adoption
• Providing the expertise and services needed to negotiate to the next stage
• Developing the policy and tools to support the community developing and using research software

Supported by EPSRC Grant EP/H043160/1


Page 3: Scientific Software: Sustainability, Skills & Sociology

Anatomy of my talk

SOFTWARE is …
• everywhere
• hard to define
• long-lived

… are IMPORTANT
• context
• reasons
• people

Page 4: Scientific Software: Sustainability, Skills & Sociology

Software is everywhere (even where you expect it)

Page 5: Scientific Software: Sustainability, Skills & Sociology

Factories

Services Cinema

Writing

Page 6: Scientific Software: Sustainability, Skills & Sociology

Software is pervasive

Page 7: Scientific Software: Sustainability, Skills & Sociology

Tamiflu binding to mutant influenza

A water-swap reaction coordinate for the calculation of absolute protein-ligand binding free energies. Woods CJ, Malaisree M, Hannongbua S, Mulholland AJ. J. Chem. Phys. (2011) vol. 134, p. 054114. http://dx.doi.org/10.1063/1.3519057

Page 8: Scientific Software: Sustainability, Skills & Sociology

Favouring of disease risk alleles

Selection at pleiotropic loci underlies disease co-occurrence in human populations. Navarro, Haley, Karosas et al. Submitted to Nature Genetics

Page 9: Scientific Software: Sustainability, Skills & Sociology

Behind every great piece of science…

#go through each SNP of interest
for(my $x = 0; $x < scalar @pos; $x++){
  #and then each downstream SNP of interest
  for(my $y = $x+1; $y < scalar @pos; $y++) {
    #if SNPs within our chosen distance (500kb) and both present in the haplotypes file
    if((!($trait[$x] eq $trait[$y])) && (abs($pos[$x] - $pos[$y]) <= 500000) && (exists($legArrayPos{$pos[$x]})) && (exists($legArrayPos{$pos[$y]}))) {
      my $snp1ArrayPos = "";
      my $snp2ArrayPos = "";
      my $snp1All = "";
      my $snp2All = "";

      #create output file for this SNP pair
      my $filename = "ConditionedResults2/$chr[$x].$pos[$x]-$pos[$y].EHH.GBR.2.txt";
      print "$filename\n";
      unless (-e $filename) {
        open(OUT, ">$filename");

        #####################CHANGE THESE IF NOT FOCUSING ON SECOND SNP#########################
        my $start = $pos[$y]-500000;
        if ($start < 1) { $start = 1; }
        my $end = $pos[$y]+500000;
        if ($end > $chrLengths{$chr[$x]}) { $end = $chrLengths{$chr[$x]}; }

Page 10: Scientific Software: Sustainability, Skills & Sociology

Software is long-lived (and outlasts computational hardware)

Page 11: Scientific Software: Sustainability, Skills & Sociology

Architectural Dominance

Image courtesy PDES Inc. Slide from Sean Barker, BAE SYSTEMS, DPC Designed to Last

Page 12: Scientific Software: Sustainability, Skills & Sociology


Computational Chemistry - CASTEP

From the first implementation of a DFT algorithm to a completely new code to community supported software

• Individual
• Group
• Consortium
• W/ industry
• Community
• Active

Software advances < hardware speedup

http://www.castep.org/

Page 13: Scientific Software: Sustainability, Skills & Sociology

LOTAR: storing aeronautical models

Life of CAD System: 10 years

Time between CAD Versions: 6 months

Life of Product: 70 years +

[Figure: product timeline (time axis marked at 10, 20, 30, 40, 50 and 60 years) with phases Production, Spares, Services, Modifications and Legal Liability, and markers for CAD Obsolete and CAD Forgotten]

Image courtesy PDES Inc. Slide from Sean Barker, BAE SYSTEMS, DPC Designed to Last
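
The mismatch between these lifetimes can be made concrete with a little arithmetic. The sketch below uses only the three figures on the slide and treats "70 years +" as exactly 70, which is an assumption; the function name is illustrative.

```python
def cad_turnover(product_life_years, cad_system_life_years, months_between_versions):
    """Return (CAD system generations, CAD version releases) that
    elapse during one product lifetime."""
    generations = product_life_years / cad_system_life_years
    versions = product_life_years * 12 / months_between_versions
    return generations, versions


# Figures from the slide: product life 70 years, CAD system life 10 years,
# a new CAD version every 6 months.
generations, versions = cad_turnover(70, 10, 6)
print(f"CAD system generations over product life: {generations:.0f}")
print(f"CAD version releases over product life: {versions:.0f}")
```

On these numbers a model must survive roughly seven complete changes of CAD system, and well over a hundred version upgrades, before the product it describes is retired.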


Page 14: Scientific Software: Sustainability, Skills & Sociology

So we have to maintain it…

• “The modification of a software product after delivery to correct faults, to improve performance or other attributes, or to adapt the product to a modified environment” – IEEE defn.
– Corrective maintenance: fixing faults
– Adaptive maintenance: adapting to changes in environment
– Perfective maintenance: meeting new/different user requirements
– Preventative maintenance: increasing maintainability

Page 15: Scientific Software: Sustainability, Skills & Sociology

… because we cannot change this with process and practice alone …

• “Many of us have tried to discover ways to prevent code from becoming legacy. But … prevention is imperfect. Even the most disciplined development team, knowing the best principles, using the best patterns, and following the best practices will create messes from time to time. The rot still accumulates. It’s not enough to prevent the rot – you have to be able to reverse it.”


Page 16: Scientific Software: Sustainability, Skills & Sociology

… so we work with what we have

• Identify change points
• Find test points
• Break dependencies
• Write tests
• Make changes and refactor

Testing, infrastructure, documentation are key
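
As a minimal sketch of these steps, take the window calculation from the Perl excerpt earlier in the talk (start and end positions 500 kb either side of a SNP, clamped to the chromosome). Pulling that logic into a pure function breaks its dependency on file I/O and global arrays, and a few tests then pin down the current behaviour before any refactoring. The function name `clamp_window` and the test values are illustrative assumptions, and Python is used here rather than the original Perl.

```python
def clamp_window(position, half_width, chrom_length):
    """Return (start, end) of a window around `position`, clamped to
    the valid coordinate range 1..chrom_length."""
    start = max(1, position - half_width)
    end = min(chrom_length, position + half_width)
    return start, end


def test_window_inside_chromosome():
    assert clamp_window(1_000_000, 500_000, 5_000_000) == (500_000, 1_500_000)


def test_window_clamped_at_start():
    assert clamp_window(100_000, 500_000, 5_000_000) == (1, 600_000)


def test_window_clamped_at_end():
    assert clamp_window(4_900_000, 500_000, 5_000_000) == (4_400_000, 5_000_000)


if __name__ == "__main__":
    test_window_inside_chromosome()
    test_window_clamped_at_start()
    test_window_clamped_at_end()
    print("all window tests passed")
```

Once tests like these exist, the surrounding loop can be changed with some confidence that the window boundaries still behave as they did before.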


Page 17: Scientific Software: Sustainability, Skills & Sociology

Software is hard to define (and thus hard to sustain)

Page 18: Scientific Software: Sustainability, Skills & Sociology

What do we sustain:
- Workflow?
- Software that runs workflow?
- Software referenced by workflow?

Page 19: Scientific Software: Sustainability, Skills & Sociology

Novel reuse of public sector data

http://www.mysociety.org

What do we sustain:
- Map?
- Software that creates map?

Page 20: Scientific Software: Sustainability, Skills & Sociology

Sustaining Function or Form

What do we sustain:
- Function?
- Form?

Page 21: Scientific Software: Sustainability, Skills & Sociology

Context is important (otherwise all you have is an object)

Page 22: Scientific Software: Sustainability, Skills & Sociology

Comb badge, Museum of London

• Without context, objects have no meaning

What’s this item?

32x28mm, lead alloy, late Medieval 14-15th century

Page 23: Scientific Software: Sustainability, Skills & Sociology

What about repositories?

re·pos·i·tor·y /noun/ [ri-poz-i-tawr-ee]

• 1. a receptacle or place where things are deposited, stored, or offered for sale.
• 2. a burial place; sepulchre.

Page 24: Scientific Software: Sustainability, Skills & Sociology

The Zombie Effect

• Software is not always fully alive when you reanimate it!

• Complex set of dependencies
– Significant Properties of Software
– Purposes and benefits of software preservation

http://www.jisc.ac.uk/media/documents/programmes/preservation/significantpropertiesofsoftware-final.doc

http://softwarepreservation.jiscinvolve.org/wp/
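
One small, practical guard against the zombie effect is to record the environment a piece of software actually ran in, alongside the code itself. The sketch below is only an illustration under stated assumptions: it covers Python-level dependencies using the standard library's `importlib.metadata`, and the function name `environment_snapshot` is hypothetical. Compilers, system libraries and data formats matter too and are not captured here.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot():
    """Collect interpreter, platform and installed-package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.version
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


if __name__ == "__main__":
    # Store this output next to the archived code or results.
    print(json.dumps(environment_snapshot(), indent=2))
```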

Page 25: Scientific Software: Sustainability, Skills & Sociology

Reasons are important (so you take the right approach)

Page 26: Scientific Software: Sustainability, Skills & Sociology

Why are you considering software sustainability?

Achieve legal compliance

Create heritage value

Enable continued access to data and services

Encourage software reuse

Purpose


Page 27: Scientific Software: Sustainability, Skills & Sociology

How are you going to choose the right approach?

Preservation (techno-centric)

Emulation (data-centric)

Migration (functionality-centric)

Transition (process-centric)

Hibernation (knowledge-centric)

Approach


Page 28: Scientific Software: Sustainability, Skills & Sociology

Preservation vs sustainability

Image courtesy of RBG Kew – not for reuse

Image courtesy of London Permaculture under CC-by-nc-sa license

Preservation?

Sustainability?


Page 29: Scientific Software: Sustainability, Skills & Sociology

People are important (people are infrastructure too)

Page 30: Scientific Software: Sustainability, Skills & Sociology

Sustainable Communities

• Cohesion and Identity: Creating a community

• Tolerance and Diversity: Smart growth through collaboration

• Efficient use of resources: Leveraging infrastructure

• Adaptability to change: Governing sustainably


Page 31: Scientific Software: Sustainability, Skills & Sociology


Cultivate Contributors – R project

• Basics: Website, mailing list, code repository, issue resolution

• Remove barriers to participation, increase efficiency

• 1993: First public release; 2 devs
• 1995: Code open sourced; 3 devs
• 1996: r-testers list set up
• 1997: lists split: r-announce, r-help, r-devel; public CVS; 11 devs
• 2000: CRAN split and mirror
• 2001: BioConductor
• 2003: Namespaces
• 2005: i18n, l10n
• 2007: R-Forge
• Today: BioConductor (33 core devs), R-Forge (532 projects, 1562 devs), CRAN (1400+ packages)

http://cran.r-project.org/doc/html/interface98-paper/paper_2.html


Page 32: Scientific Software: Sustainability, Skills & Sociology

We under-appreciate training

• Basic training for kitchen chef: 3-4 years

• Head chef: 10 years

• Basic training for s/w engineer: 3-4 years

• Architect: 10 years

Photo by ZagatBuzz

• Training in S/W Dev in UG Physics: 140 hours
• Training in S/W Dev in UG Geography: 0 hours


Page 33: Scientific Software: Sustainability, Skills & Sociology

Software Carpentry

• Lab skills for scientific computing
– http://software-carpentry.org
– International initiative to teach basics of software engineering to researchers
• The “why” more than the “how”
– We ran 13 workshops in 2013 for 600+ learners

Page 34: Scientific Software: Sustainability, Skills & Sociology

Incentives are important

Courtesy of James Howison and James Herbsleb, Incentives and Integration In Scientific Software Production

• Rewrite by original team: address fragility
• Fork to add specific functionality; maintained separately
• Optimised for hardware; facilitate hardware sales
• Exploit new techniques / architectures

Page 35: Scientific Software: Sustainability, Skills & Sociology

And money isn’t everything

[Figure: funding/staffing plotted against time for a physics experiment. Labels: Experiment running; Analysis of data; New experiment design starts; Next expt. running; Maintenance of software to process data from physics experiment]

Page 36: Scientific Software: Sustainability, Skills & Sociology

So beware your bus factor
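
One rough way to put a number on this is to look at who has actually changed the code. The sketch below takes a mapping from contributor to number of commits (hypothetical data, not from any real project) and reports the smallest number of people who together account for more than half of the commits. This is only one of several informal definitions of bus factor, and commit counts are a crude proxy for knowledge of the code.

```python
def bus_factor(commits_by_author, threshold=0.5):
    """Smallest number of authors whose combined commits exceed
    `threshold` of all commits (a crude proxy for the bus factor)."""
    total = sum(commits_by_author.values())
    if total == 0:
        return 0
    covered = 0
    # Count the most active authors first.
    ranked = sorted(commits_by_author.values(), reverse=True)
    for n_authors, count in enumerate(ranked, start=1):
        covered += count
        if covered > threshold * total:
            return n_authors
    return len(ranked)


# Hypothetical project where one person wrote almost everything.
example = {"alice": 120, "bob": 15, "carol": 5}
print("bus factor:", bus_factor(example))
```

A result of 1 means that losing a single person would take most of the project's working knowledge with them.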


Page 37: Scientific Software: Sustainability, Skills & Sociology

Summary of my talk

SOFTWARE is …
• everywhere
• hard to define
• long-lived

… are IMPORTANT
• context
• reasons
• people

Page 38: Scientific Software: Sustainability, Skills & Sociology

Take home messages

No-one sets out to write unsustainable software

Software sustainability is important because it has to happen

People need the skills and incentives to maintain software through its lifetime

Page 39: Scientific Software: Sustainability, Skills & Sociology

Work with us – www.software.ac.uk
