Summary of papers submitted to WSSSPE 2013 that address the role of communities in scientific software.
Workshop in Sustainable Software for Science: Practice and Experience
Communities
Karen Cranston
National Evolutionary Synthesis Center
@kcranstn
http://wssspe.researchcomputing.org.uk/
Workshop notes at http://bit.ly/wssspe13
These slides: http://www.slideshare.net/kcranstn/wssspe-cranston-community
Communities for sustainable software
Users Developers
software is useful & usable
features added; bugs fixed
discussion
help
feedback
Extensibility
❖ data management is a generic problem
❖ iRODS = highly customizable data management solution
❖ many functions (data access, processing, provenance…)
❖ users create policies for specific needs
❖ over 25 science & engineering domains in user list
Moore, Reagan M. Extensible Generic Data Management Software. http://arxiv.org/abs/1309.5372
Co-ordination of effort
❖ high-energy physics relies on computer modeling
❖ lack of coordination between projects
❖ propose:
❖ develop teams of technical specialists
❖ target many different architectures
❖ common scripting language / APIs
Bruhwiler, David; Vay, Jean-Luc; Geddes, Cameron G. R.; Koniges, Alice; Friedman, Alex; Grote, David P. (2013): White Paper on DOE-HEP Accelerator Modeling Science Activities. http://dx.doi.org/10.6084/m9.figshare.793816
User engagement
❖ Involve scientists in feedback and improvement
❖ ‘Technology catalysts’: people with domain & technical skills
Reusability in Science: From Initial User Engagement to Dissemination of Results
Ketan Maheshwari*, David Kelly*, Scott J. Krieder†, Justin M. Wozniak*, Daniel S. Katz‡, Mei Zhi-Gang§, Mainak Mookherjee¶
*MCS Division, Argonne National Laboratory
†Department of Computer Science, Illinois Institute of Technology
‡Computation Institute, University of Chicago & Argonne National Laboratory
§Nuclear Engineering Division, Argonne National Laboratory
¶Department of Earth and Atmospheric Sciences, Cornell University
Abstract—Effective use of parallel and distributed computing
in science depends upon multiple interdependent entities and
activities that form an ecosystem. Active engagement between
application users and technology catalysts is a crucial activity
that forms an integral part of this ecosystem. Technology catalysts
play a crucial role benefiting communities beyond a single user
group. An effective user-engagement, use and reuse of tools and
techniques has a broad impact on software sustainability. From
our experience, we sketch a life-cycle for user-engagement activity
in scientific computational environment and posit that application
level reusability promotes software sustainability. We describe
our experience in engaging two user groups from different
scientific domains reusing a common software and configuration
on different computational infrastructures.
Index Terms—Technology-catalyst, user-engagement, scientific
computation
I. INTRODUCTION
Domain scientists often have limited time to investigate the capabilities that a large scale computing and data-handling infrastructure combined with a high performance software framework could bring to their scientific activities. Technology catalysts help speed up the tedious process of organizing scientific computations such that they can be easily mapped onto computational infrastructure. However, this is an iterative process and not free of pitfalls. The source of these pitfalls can be the scientific process itself or a mismatch in technical requirements mapping to computational infrastructures.
This presents a challenge: enabling effective reuse of existing user-engagement patterns and related products for a new scientific user. If this challenge can be met, it could lead to a considerable acceleration in the process of conceptualizing, describing, defining, deploying, and executing scientific experiments. Reuse of data and software libraries is fairly common. Reuse of enabled applications across scientific domains is not as common. A successful execution of such applications might require tuning specific to the application requirements. However, with familiarization from previous engagements, much of this process can be expedited.
Fig. 1. Activities and transitions in user engagement cycle.
A widely reused system has a higher sustainability as a community supports its maintenance. Enabling application level reuse promotes sustainability of the entire ecosystem of modern science. In this experience paper, we report on the following:
1) Experience in scientific community engagement describing activities performed at different levels in order to support scientific users with applications deployed onto new, larger and faster systems.
2) A sketch and demonstration of the elements of a successful scientific application deployment cycle.
3) Enhancements of an enabling software framework based on user feedback resulting in a software with improved usability.
The remainder of this paper is organized as follows. In Section II, we describe the user engagement cycle that provides a context to the human aspects of our work. In Section III, we describe the applications on which our experience is based and in which we apply software reuse techniques. In Section IV, we describe the hardware and software complexities that make software maintenance strategies important. In Section VI, we summarize some related work, and in Section VII we offer concluding remarks.
II. USER ENGAGEMENT CYCLE
User engagement with technical catalysts is a complex social process that differs with respect to institutions, culture, and technical practices. A distillation of key points in the process is diagrammed in Fig. 1. The cycle consists of four phases and transition activities between these phases. It starts with user-catalyst communication involving familiarizing the domain science and technology from the
arXiv:1309.1813v1 [cs.SE] 7 Sep 2013
Maheshwari, K.; D. Kelly, S.J. Krieder, J.M. Wozniak, D.S. Katz, M. Zhi-Gang, M. Mookherjee. Reusability in Science: From Initial User Engagement to Dissemination of Results. http://arxiv.org/abs/1309.1813
❖ identify generic software pattern for running common software on different HPC architectures
Make it usable
❖ Good software engineering processes important
❖ easier for people to use and contribute
❖ Service-based business models
❖ multiple communication channels, maintenance, training
Hanwell, Marcus; Perera, Amitha; Turner, Wes; O'Leary, Patrick; Osterdahl, Katie; Hoffman, Bill; Schroeder, Will (2013): Sustainable Software Ecosystems for Open Science. http://dx.doi.org/10.6084/m9.figshare.790756
Hackathons
❖ NESCent = (domain scientists) + (in-house informatics team)
❖ Hackathon model:
❖ hands-on coding event with users, researcher-developers, software engineers
❖ Community mailing list critical resource years later
Cranston, Karen; Vision, Todd; O'Meara, Brian; Lapp, Hilmar (2013): A grassroots approach to software sustainability. http://dx.doi.org/10.6084/m9.figshare.790739
Identify gaps
❖ Tools & APIs for access to online data / resources
❖ Direct collaboration / support for data providers
❖ Workshops and training for users
Chamberlain, Scott; Hart, Edmund; Ram, Karthik; Boettiger, Carl (2013): rOpenSci - a collaborative effort to develop R-based tools for facilitating Open Science. http://dx.doi.org/10.6084/m9.figshare.791569
Good software engineering
❖ More welcoming for developers
❖ Easier for users to engage / test
❖ Find common requirements across projects
❖ Don’t neglect usability
❖ Open-source software
Community engagement
❖ Multiple communication channels
❖ Direct interaction
❖ People and centers with cross-over skills