Sustainable development of research software
Jörg Fehr, Christian Himpe, Stephan Rave, Jens Saak
90th annual meeting of GAMM
MS-3: Research software and -data: How to ensure replicability, reproducibility, and reusability
Vienna 2019
Supported by:
“Sustainability of research software” call
pyMOR: Sustainable Software for Model Order Reduction
Introduction: Motivation
Observation (of an inadequacy):
In-silico experiments precede classic experimentation across the sciences:
Chemistry: Reactions are designed on computers prior to verification in wet labs.
Control Engineering: Controls are adapted in elaborate computer experiments before being applied in the real world.
Material Science: New materials emerge from computer-based optimization.
Still, there is no standard defining how to present, document and publish computer-based experiments.
Also, there is no standard defining how to preserve code and data used in those experiments.
Jens Saak, [email protected] Sustainable development of research software 2/16
Introduction: Motivation
Observation (of a chance):
Science builds on previous findings
in theory, using e.g. theorems
and in practice, using established methods
Often, the first step of a new scientific endeavor is the reproduction of previous results
Building upon computational investigations could be incredibly easy
software is easy to share and to modify
hardware is easy to replace
Introduction: Our Aim
Improve Computer-Based Experiments (CBEx):
Create problem-awareness and
Ensure scientificity and progress
Define terminology
Establish best-practices
Formulate discipline-agnostic practical guidelines
Improve availability and quality of research software
Outline
1. Introduction
   Motivation
   Our Aim
2. RRR to FAIR
   Replicability
   Reproducibility
   Reusability
   The Road to Sustainability
3. Proposed Development Practices
   Small Project
   Large Project
RRR to FAIR: Replicability
[Fehr/Heiland/Himpe/S. 16] https://doi.org/bsb2
Definition
The attribute Replicability describes the ability to repeat a CBEx and to come to the same (in a numerical sense) results. Sometimes the equivalent term Repeatability is used for this experimental property.
Replicability is a basic requirement of reliable software as well as of its result as it shows a certain robustness of the procedure against
statistical influences
and bias of the observer.
Also, only replicable CBEx can serve as a benchmark to which new methods can be compared, cf. [Vitek & Kalibera ’11].
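A replicability check in the sense of the definition above can be as simple as repeating a run and comparing the results numerically. The following Python sketch is only an illustration and is not taken from the talk: the toy Monte Carlo experiment, the function names `run_experiment` and `is_replicable`, and the tolerance are all assumptions.

```python
import math
import random


def run_experiment(seed: int, n_samples: int = 10_000) -> float:
    """Hypothetical CBEx: Monte Carlo estimate of pi with a fixed seed."""
    rng = random.Random(seed)  # local generator, no hidden global state
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / n_samples


def is_replicable(seed: int, rel_tol: float = 1e-12) -> bool:
    """Repeat the experiment and compare both results up to a tolerance."""
    first = run_experiment(seed)
    second = run_experiment(seed)
    return math.isclose(first, second, rel_tol=rel_tol)


if __name__ == "__main__":
    print("replicable:", is_replicable(seed=42))
```

Comparing with a tolerance rather than with `==` mirrors the "in a numerical sense" qualifier: real experiments may differ in the last digits between runs, for example through parallel reductions.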
RRR to FAIR: Reproducibility
[Fehr/Heiland/Himpe/S. 16] https://doi.org/bsb2
Definition
Reproducibility of a CBEx means that it can be repeated by a different researcher in a different computer environment.
This is an adaptation of the general concept of Reproducibility
that is key in any science that relies on experiments,
that is a subject in the theory of science, and
whose absence in a significant fraction of publications in many research areas has shaped the term Reproducibility crisis in recent years [Marcus ’13]; cf. also [Collberg, Proebsting, & Warren ’14] on Reproducibility in computer science.
RRR to FAIR: Reusability
[Fehr/Heiland/Himpe/S. 16] https://doi.org/bsb2
Definition
In the sphere of CBEx, Reusability refers to the possibility to reuse the software or parts thereof for different purposes, in different environments, and by researchers other than the original authors.
In particular, Reusability enables the utilization of the test setup or parts of it for other experiments or related applications.
Although, theoretically, any bit of software can be reused for different purposes, here Reusability applies only to reproducible parts.
The Road to Sustainability
Summary
Replicability ← This is a sanity check
Reproducibility ← This makes it science
Reusability ← This is the main step towards sustainability
Replicability → Reproducibility → Reusability → Sustainability
Sustainable software is:
Findable, Accessible, Interoperable, Reusable
Proposed Development Practices
small project ← often single developer and user
paper code, thesis project code
large project ← separate developer and user groups
group's in-house tool, community code, ...
Small Project: Requirements
Code availability (recoverable from central institute repository)
Working example(s) (easier handover, usable for testing)
Code ownership (institute? supervisor? developer?)
Execution environment (documentation of soft- and hardware for compilation and execution)
Minimal documentation (README)
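The "Execution environment" requirement can be partly automated. As a minimal sketch, assuming a Python-based project, the following script records basic soft- and hardware facts next to the results. The function names and the file name `ENVIRONMENT.json` are hypothetical, and the selection of fields is an assumption; a real project would likely also record compiler and library versions.

```python
import json
import platform
import sys
from datetime import datetime, timezone


def describe_environment() -> dict:
    """Collect basic facts about the machine and interpreter in use."""
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "python_implementation": platform.python_implementation(),
        "operating_system": platform.platform(),
        "machine": platform.machine(),
        "executable": sys.executable,
    }


def write_environment(path: str = "ENVIRONMENT.json") -> None:
    """Store the environment description as human-readable JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(describe_environment(), fh, indent=2)


if __name__ == "__main__":
    write_environment()
```

Calling `write_environment()` at the start of each run keeps the description in sync with the results it belongs to, which is easier to maintain than a hand-written note in the README.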
Small Project: Recommendations
Public release (Zenodo, Figshare, Run My Code, ...)
Version control (track changes, named revisions, BACKUP!)
Basic code cleanup (obscure constants, dead code, hard-coded paths)
Reproducible execution environment (virtual machine, docker, ...)
Integration into larger project (e.g. in-house or community code library)
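To illustrate the "Basic code cleanup" item, here is a small Python sketch of what removing obscure constants and hard-coded paths might look like. All names and values (`MAX_ITERATIONS`, `TOLERANCE`, `data_path`, `has_converged`, the `data` subdirectory) are hypothetical and chosen only for this example.

```python
from pathlib import Path
from typing import Optional

# Named constants replace unexplained literals scattered through the code.
MAX_ITERATIONS = 200   # hypothetical iteration limit
TOLERANCE = 1e-8       # hypothetical stopping tolerance


def data_path(name: str, base_dir: Optional[Path] = None) -> Path:
    """Build a data path relative to a base directory (default: cwd).

    This avoids absolute paths that only exist on the author's machine.
    """
    base = Path.cwd() if base_dir is None else Path(base_dir)
    return base / "data" / name


def has_converged(residual: float, iteration: int) -> bool:
    """Stop when the residual is small or the iteration budget is used up."""
    return residual < TOLERANCE or iteration >= MAX_ITERATIONS
```

For example, `data_path("matrix.csv", Path("project"))` yields `project/data/matrix.csv` on any machine, so the code can be handed over without editing paths.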
Large Project: Requirements
Software license (license compatibility?)
Code ownership of contributions (re-licensing, availability of copyright holders, ...)
Access to project resources (website, code repo, mailing list, support desk, ...) (developer hierarchy, responsibilities)
Development in branches (stable master, management of branches, ...)
Changelog (compressed history for smooth handover)
Large Project: Recommendations
Code maintainability (code reviews, automatic testing and deployment (CI))
Code of Conduct (handover guidelines, new and leaving maintainers, ...)
Contribution Policy (who? how? required skills?)
Citation Policy (Do developers/authors get the credits?)
pyMOR School
October 7-11, 2019
MPI Magdeburg
https://school.pymor.org
Open For Discussions
Jörg Fehr
Stephan Rave
Christian Himpe
Jens Saak
Talk to us @ coffee breaks!
Many Thanks!
Questions? Remarks? Suggestions?
Related Work
Software deposit guidance for researchers (The Software Sustainability Institute)
Recommendations on the development, use and provision of Research Software (Alliance of German Science Organisations)
Criteria for Software Self-Assessment (INRIA Evaluation Committee)
Open Source Guides (GitHub and friends)
...
Large Project
[Fehr/Heiland/Himpe/S. 16] https://doi.org/bsb2
• Replicability
Required: Basic Documentation
Recommended: Automation & Testing
• Reproducibility
Required: Extensive Documentation
Recommended: Availability
• Reusability
Required: Accessibility
Recommended: Software Management, Modularity & Licensing
References I
J. Vitek, T. Kalibera, Repeatability, reproducibility, and rigor in systems research, in: Proceedings of the 9th ACM International Conference on Embedded Software, 2011, pp. 33–38. URL https://doi.org/10.1145/2038642.2038650
G. Marcus, The crisis in social psychology that isn’t (2013). URL https://www.newyorker.com/tech/elements/the-crisis-in-social-psychology-that-isnt
C. Collberg, T. Proebsting, A. M. Warren, Repeatability and benefaction in computer systems research, Tech. rep., University of Arizona (2014), accessed: 2016-09-22. URL http://reproducibility.cs.arizona.edu/v2/RepeatabilityTR.pdf
J. Fehr, J. Heiland, C. Himpe, J. Saak, Best practices for replicability, reproducibility and reusability of computer-based experiments exemplified by model reduction software, AIMS Mathematics 1 (3) (2016) 261–281. doi:10.3934/Math.2016.3.261.
References II
INRIA Evaluation Committee, Criteria for Software Self-Assessment, INRIA (Aug. 2011). URL https://www.inria.fr/content/download/12702/427946/version/2/file/softwarecriteria-ce_2011-08-01.pdf
The Software Sustainability Institute, Software deposit guidance for researchers, edited by Michael Jackson (Aug. 2018). URL https://softwaresaved.github.io/software-deposit-guidance/
GitHub and Friends, Open Source Guides, GitHub, accessed 2019-02-17. URL https://opensource.guide/
Research Software Working Group in the Priority Initiative Digital Information of the Alliance of German Science Organisations, Recommendations on the development, use and provision of research software, version 1.0 (Mar. 2018). doi:10.5281/zenodo.1172988.
References III
M. D. Wilkinson, M. Dumontier, I. J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J.-W. Boiten, J. Bonino da Silva Santos, P. E. Bourne, J. Bouwman, A. J. Brookes, T. Clark, M. Crosas, I. Dillo, O. Dumon, S. Edmunds, C. T. Evelo, R. Finkers, A. Gonzalez-Beltran, A. J. G. Gray, P. Groth, C. Goble, J. S. Grethe, J. Heringa, P. A. C. ’t Hoen, R. Hooft, T. Kuhn, R. Kok, J. Kok, S. J. Lusher, M. E. Martone, A. Mons, A. L. Packer, B. Persson, P. Rocca-Serra, M. Roos, R. van Schaik, S.-A. Sansone, E. Schultes, T. Sengstag, T. Slater, G. Strawn, M. A. Swertz, M. Thompson, J. van der Lei, E. van Mulligen, J. Velterop, A. Waagmeester, P. Wittenburg, K. Wolstencroft, J. Zhao, B. Mons, The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data 3 (2016). doi:10.1038/sdata.2016.18.