Randall Wetzel, MD, MS Roby Khemani, MD, MS Dave Kale, MS Paul Vee, MBA Dan Crichton, MS Chris Mattmann, PhD Andrew Hart, MS Cameron Goodale, MBA Paul Zimdars MUCMD.ORG | SABAN RESEARCH CENTER | AUGUST 26-27 2011 1

Understanding the Meaningful Use of Open Source Software

DESCRIPTION

My presentation from the Meaningful Use of Complex Medical Data Symposium.

Page 1: Understanding the Meaningful Use of Open Source Software

Randall Wetzel, MD, MS
Roby Khemani, MD, MS
Dave Kale, MS
Paul Vee, MBA
Dan Crichton, MS
Chris Mattmann, PhD
Andrew Hart, MS
Cameron Goodale, MBA
Paul Zimdars

MUCMD.ORG | SABAN RESEARCH CENTER | AUGUST 26-27 2011

Page 2: Understanding the Meaningful Use of Open Source Software

Areas for Open Source within the clinical context
Strategies and Decision Points
Discuss Open Source in the context of our current NLM grant between CHLA and JPL

This is a shorter version of a longer, half day breakout on NASA Open Source that I led at the 9th NASA Earth Science Data System Working Group Meetings in October 2010
› http://s.apache.org/anM

Also check out NASA OSS slides that inspired this:
› http://s.apache.org/pw

Does ‘open source’ provide clinical opportunities?

Can we devise a framework for understanding the concerns and challenges of using open source software in “big data” systems?

Page 3: Understanding the Meaningful Use of Open Source Software

• Apache Member involved in
  – OODT (VP, PMC), Tika (VP, PMC), Nutch (PMC), Incubator (PMC), SIS (Mentor), Lucy (Mentor) and Gora (Champion), Hadoop MRUnit (Mentor)

• Architect/Development Lead at NASA JPL in Pasadena, CA

• Software Architecture/Engineering Prof at USC

Page 4: Understanding the Meaningful Use of Open Source Software

It’s here
Chances are, if you build software nowadays, you are using some type of open source software

Let me give you some context…

Page 5: Understanding the Meaningful Use of Open Source Software

Where is open source most useful?

Which area should produce open source software?

Page 6: Understanding the Meaningful Use of Open Source Software

Licensing
› GPL (v2, v3?), LGPL (v?), BSD, MIT, ASLv2
› Your own custom license approved by OSS
  NLM OSS license? CHLA license?
› Copy-left versus Copy-right
Redistribution
› Can you take open source product X and use it in your commercially interested software Y? If so, do you have to pay for it?
  Should others pay for your open source product if they use it in their commercial application?
Open Source “Help Desk” Syndrome versus Community
› Are you trying to simply make your open source software (releases) available for distribution (aka help desk)?
› Are you trying to get others to “buy in” to your open source software?

Page 7: Understanding the Meaningful Use of Open Source Software

Intellectual Property
› Who owns it?
› How does the Open Source Software affect your IP?
Open Source Ecosystems
› Where can you find the “killer app” you need?
› Which communities are conducive to longevity?
› How relevant are “generic” open source software communities to Clinical “Big Data” Data Systems?
Contributing
› Are you even allowed to contribute to an OSS community?
› Can you do it on “company” time?
› What’s required?
› What’s the governance?
Responsiveness
› How responsive is the OSS community to your projects’ needs?

Page 8: Understanding the Meaningful Use of Open Source Software

Help/Guidance
› Is the OSS community/project alive?
› How can you tell whether the project is alive?
› How can you “follow the rainbow” to the OSS pot of gold?
Interaction
› What are the best practices for interacting with OSS communities?
Implementation strategies
› Insulation: how do you insulate your project from OSS change?
› Configuration Management: what are the important CM issues when using Open Source Software in your organization?
Architectural strategies
› How to design your system to take advantage of OSS?
Legal strategies
› How to avoid getting sued by some huge tech company?

Page 9: Understanding the Meaningful Use of Open Source Software

The aforementioned OSS concerns are cross cutting against the whole clinical enterprise!

Page 10: Understanding the Meaningful Use of Open Source Software

Relates to: redistribution, intellectual property, contributing, legal strategies

There are tons of OSS approved licenses
› What’s the difference between them?
The difference mostly has to do with
› Commercialization
› Redistribution
› Attribution

Page 11: Understanding the Meaningful Use of Open Source Software

So you’ve built some awesome piece of clinical software
› And you’re wondering
› What are my options for distributing it to
  Other PICUs?
  Other Grant-funded projects?
› Any other collaborators?
Question: what license are you going to choose?
› Hopefully one that supports redistribution under your own terms!
How to redistribute the software?
› Requires infrastructure
  Who’s going to set it up?

Page 12: Understanding the Meaningful Use of Open Source Software

Should your organization stand up its own?
Where should you go?

Page 13: Understanding the Meaningful Use of Open Source Software

Start out with Incubation

Grow community

Make releases

Gain interest

Diversify

When the project is ready, graduate into
› Top-Level Project (TLP)
› Sub-project of TLP

Increasingly, Sub-projects are discouraged compared to TLPs

Page 14: Understanding the Meaningful Use of Open Source Software

• Apache is a meritocracy
  – You earn your keep and your credentials
• Start out as Contributor
  – Patches, mailing list comments, etc.
  – No commit access
• Move on to Committer
  – Commit access, evolve the code
• PMC Members
  – Have binding VOTEs on releases/personnel
• Officer (VP, Project)
  – PMC Chair
• ASF Member
  – Have binding VOTE in the state of the foundation
  – Elect Board of Directors
• Director
  – Oversight of projects, foundation activities

Page 15: Understanding the Meaningful Use of Open Source Software

• Project Proposal
  – Accepted? Get going!
• No foundation-wide oversight
• Tons of dormant projects with no communities of interest
• Goal is to host infrastructure and host technologies
• Goal is not to build communities
• No foundation-wide rules or guidelines for committership or for project management
  – Dealt with locally by the progenitor of the project
  – Can lead to BDFL (benevolent dictator for life) syndrome
• No foundation-wide license requirements
  – BSD, GPL (v2, v3), MIT, LGPL, etc. all allowed

Page 16: Understanding the Meaningful Use of Open Source Software

FTR, there are tons of concerns here
› Should the ecosystem impose license restrictions?
  Only support license X or Y?
› What are the redistribution policies?
› What’s the community?
› What’s the IP?
› What’s the infrastructure support?
› Is it a foundation with rules, or just free for all?
Elephant in the room: What is the exposure? How many people are going to community C and are going to see your software and perhaps want to use it, improve it, file bugs against it, file patches, etc.?
Where/how does this matter to you?
› Standing up our own OSS ecosystem may make sense for large, coarse-grained federations
› A LOT more difficult to justify OTOH, for fine grained components, and module reuse

Page 17: Understanding the Meaningful Use of Open Source Software

• How do you want to have your open source software project run in the open?
  – Do you expect folks to come to you and only file bugs?
    • Then you and your team are the only ones who can fix them?
    • Then you and your team are the only ones who can release updates?
    • “Help Desk” open source project
    • Examples: Sourceforge.net, Google Code, etc.
  – Do you expect to grow a community of interest where volunteers actively engage in software development?
    • Are volunteers empowered to pick up a shovel and help dig the hole?
    • Can volunteers (including your own paid employees) file issues?
    • Do you want to give the community a stake/vote in the overall process?
    • “Community Building” open source project
    • Examples: Apache Software Foundation, Eclipse Foundation, etc.

Page 18: Understanding the Meaningful Use of Open Source Software

• Working on common software
  – Measured not in terms of which center a contributor works for, but in terms of
    • Number of patches contributed (high quality)
    • Mailing list questions answered
    • # of releases made, or helped with
    • Tests written
    • Documentation added
  – Take the politics out of it and just work on “core” common code of mutual interest
• Deciding on the right redistribution mechanism and license
  – Apache Software Foundation and ALv2 provide openness and ability for center-local redistribution, commerciality and other decisions (for internal distributions and beyond)

Page 19: Understanding the Meaningful Use of Open Source Software

• Centers can similarly (if they so choose) have their own customizations/distributions of OSS
  – CHLA VPICU’s FooBar powered by Apache Hadoop
  – Seattle Children’s Hospital’s Baz built on Apache Pivot powered by Linux
• Example: NASA protects its marks through the New Technology Report process
  – Uncover marks
  – Uncover potential patents
  – License/etc., as appropriate
  – We see this less on the ground side than we do with the flight
  – Mostly NASA wide, but some center specifics (e.g., Caltech)
• Takeaway: the answer to “who owns it” is still “the Center” or “the project” who initiated the software development, with or without OSS. OSS may have its own restrictions/rules, so need to be wary of that (recall copyleft syndrome)

Page 20: Understanding the Meaningful Use of Open Source Software

OSS communities are built around “contributions”
› But what does it mean to contribute?
Mailing list help/discussions
› Yes those are contributions
Reporting/filing bugs and new feature improvements
› Yes those are contributions
Writing documentation and user guides
› Yes those are contributions
Volunteering to make releases
› Yes those are contributions
Positively contributing towards the evolution of the project with your thoughts and ideas
› Yes those are contributions

NOTE: What did I leave out?

Page 21: Understanding the Meaningful Use of Open Source Software

• Yes yes, code is important (but not the only contribution)
• How should you go about your code contributing process to open source?
• Relates to: “Help Desk” versus Community
• Do you want the members of your project to be the only folks who can actually touch the source code of your OSS project?
  – Only bug reports?
  – How “open” are you?
  – How “open” do you want to be?
• Do you want to accept code contributions from the community?
  – Patches?
  – Allow community members to become “committers” (assist in the development of the code)?
  – RTC (review-then-commit) versus CTR (commit-then-review)

Page 22: Understanding the Meaningful Use of Open Source Software

Measure of a “healthy” community
› Does your patch sit?
› Do your mailing list questions go unanswered?
› Is the project electing new “committers” (if it’s in community mode)?
How should your group decide whether or not an OSS project you want to use is responsive?
How should your group decide how responsive it will be?
Heavily influenced by:
› Community mode: “Help desk” versus “Community”
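
One way a group might make the three health questions on this slide concrete is to tally them over a review period. The Python sketch below is only an illustration under assumptions: the `ProjectActivity` fields and the `responsiveness` helper are hypothetical names, the counts would have to be collected by hand from the project's issue tracker and mailing list archives, and no threshold for "healthy" is implied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProjectActivity:
    """Hand-collected counts for one review period (all fields are hypothetical)."""
    patches_submitted: int
    patches_answered: int      # patches that received any review or reply
    questions_asked: int
    questions_answered: int    # mailing list questions that received a reply
    committers_elected: int


def _ratio(done: int, total: int) -> Optional[float]:
    """Return done/total, or None when there was nothing to measure."""
    return done / total if total else None


def responsiveness(activity: ProjectActivity) -> dict:
    """Summarize the three health questions as two ratios and a flag."""
    return {
        "patch_response_rate": _ratio(activity.patches_answered,
                                      activity.patches_submitted),
        "question_response_rate": _ratio(activity.questions_answered,
                                         activity.questions_asked),
        "electing_committers": activity.committers_elected > 0,
    }
```

A low response rate combined with no newly elected committers would suggest a project run in "help desk" mode, though how to weigh these numbers is a judgment call for each group.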

Page 23: Understanding the Meaningful Use of Open Source Software

• Using Open Source Software is different than producing software intended for distribution in open source
  – Either way: what’s your strategy for getting help?
  – It used to be: you can’t rely on OSS for any “real time” support
  – Not anymore: companies stood up “around” OSS technology, dedicated to providing support
    • Apache Hadoop: Cloudera, Hortonworks, EMC, Netapp, MapR
    • Apache Solr/Lucene: Lucid Imagination
    • Apache Cassandra: Riptano
    • Apache Maven: Sonatype
    • …others?
• How good is the documentation for the OSS Project?
  – Healthy wiki?
  – Healthy “cook books”?
  – Is the documentation baked in or separate?
  – Javadocs (or equivalent) on all public methods?
  – Use cases?

Page 24: Understanding the Meaningful Use of Open Source Software

Build expertise and intellectual capital in existing open source technology
› If you are pulling it in and using it
› Involves: “Community” model, but also works in “Help Desk”
› Involves: active participation
› Involves: friendly license (especially if you want to redistribute)
If you are building your own technology that you want to put into open source
› Others may “help” you
› Solve challenges at ZERO cost to you (well not ZERO but minimal)
› Free cycles of interest

Page 25: Understanding the Meaningful Use of Open Source Software

Case Study: Mailing lists communications*

* Taken from: http://wiki.apache.org/nutch/Becoming_A_Nutch_Developer

Page 26: Understanding the Meaningful Use of Open Source Software

Mailing list interaction very important
› Most immediate feedback you get when contributing to open source, and also when creating your own open source project
› Is the biggest thing that turns people away to start out with
› “who are these nerds and why don’t they have any manners?”
› It’s all about “learning” their “language”
Sure, the cost may be high in terms of ego and attitude, but the reward (free work!) is well worth it
Following and understanding the practices of the OSS project are hugely important
› Know the license for an OSS product
› Know whether they are “community” versus “help desk”
› Know its history
› Read its documentation
› Be informed!

Danger: thinking the OSS community is your personal helpdesk (or vice versa, you are its)

Page 27: Understanding the Meaningful Use of Open Source Software

• How to design for open source?
• Extension points
  – Places in your code for “plug ins”
  – Plugins should be well documented
    • Public facing “javadoc” (or similar)
    • “Cookbooks” for integrating
    • Wiki pages
    • Baked in documentation
• Use of shared component marketplace
  – Many times PL specific
  – PyPI – Python
  – Maven Central/Ibiblio – Java
  – CPAN – Perl
  – STL, GNU, Autoconf, Make, etc. – C/C++
  – YMMV, some are
    • Webified, have package metadata, searchable, etc.
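
To make the idea of an extension point concrete, here is a minimal Python sketch of a plug-in registry. It is an illustration under assumptions, not code from any project named in this talk: `register`, `run_pipeline`, and the two example plug-ins are hypothetical names, and the threshold in `flag_high` is an arbitrary number chosen only for the example.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps a plug-in name to a function that transforms a record.
_PLUGINS: Dict[str, Callable[[dict], dict]] = {}


def register(name: str):
    """Decorator that registers a function as a named plug-in."""
    def decorator(func: Callable[[dict], dict]) -> Callable[[dict], dict]:
        if name in _PLUGINS:
            raise ValueError(f"plug-in {name!r} is already registered")
        _PLUGINS[name] = func
        return func
    return decorator


def run_pipeline(record: dict, plugin_names: List[str]) -> dict:
    """Apply the named plug-ins in order; the core only knows them by name."""
    for name in plugin_names:
        # Pass a copy so a plug-in cannot mutate the caller's record.
        record = _PLUGINS[name](dict(record))
    return record


@register("add_units")
def add_units(record: dict) -> dict:
    """Example plug-in: tag a reading with a unit."""
    record["unit"] = "bpm"
    return record


@register("flag_high")
def flag_high(record: dict) -> dict:
    """Example plug-in: flag readings above an arbitrary, illustrative threshold."""
    record["high"] = record["value"] > 160
    return record
```

Calling `run_pipeline({"value": 172}, ["add_units", "flag_high"])` runs both plug-ins in order. The docstrings stand in for the "baked in documentation" bullet: a contributor can add a new plug-in without touching the core function.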

Page 28: Understanding the Meaningful Use of Open Source Software

• Don’t expect to use Haskell and have a huge OSS community there to assist you
• Some modern active OSS languages
  – Python
  – Perl
  – Ruby
  – Java
• Don’t just be a consumer, be a producer
  – Pick up a shovel and help dig the hole
  – Instead of overlooking the OSS community and telling them how the hole should be dug to meet your needs
  – Will help you get needed OSS feedback for your technology

Page 29: Understanding the Meaningful Use of Open Source Software

Understand patent law
› Or pay people to understand it for you
Understand OSS licenses and what you sign up for when you use relevant technologies and their licenses
Look for open APIs and open specs
› Understand what “open” means
Vendor lock in
› Try to avoid it through the use of all the techniques of OSS which we’ve discussed
Engage the community and understand what’s “really” free and what isn’t
Get legal’s buy-in
› Don’t insulate them

Page 30: Understanding the Meaningful Use of Open Source Software

Understand “marks”
› Trademarks
› Their implication to patents
What will your agency support in terms of licensing?
› Center specific
› Depends on IP, export control, ITAR, compliance, commercialization
› If you make it through all of the above and the software is cleared for release, you really are in the driver’s seat
  No agency wide guidelines
  Most or all communities allowed
  Most or all OSS approved licenses allowed
Is there a Contributor License Agreement (CLA) or Corporate Contributor License Agreement (CCLA) to sign?
› What are its implications?

Page 31: Understanding the Meaningful Use of Open Source Software

“Something like the Apache foundation is the best place for released government software. A previous attempt at release and public distribution via a private company was a truly dismal failure. OpenPBS (portable batch system) is supposed to be available to anyone that asks. However when you do ask a sales rep strings you along for more than a month trying to sell you something that they can't actually assure you will fit your requirements (and is no longer under development) even when the free one is documented as doing so. It was a truly stupid waste of the salesperson's time and mine that would have exceeded the price of providing the file for download or sending by email by several orders of magnitude and generated a lot of ill will. I'll go as far as saying it was blatant false advertising using a government funded open source product to do a bait and switch to try to sell me an unmaintained product they picked up in a corporate take over. My experience appears to have been identical to that of many that attempted to obtain this government funded open source software that NASA had declared was available for anyone. Eventually due to this open source project becoming closed the project just had to fork and the compatible Torque batch system was developed by people that had actually get hold of the original OpenPBS.”

– Slashdot user comment

Page 32: Understanding the Meaningful Use of Open Source Software

• Originally funded by NASA to focus on
  – distributed science data system environments
  – science data generation – data capture, end-to-end
  – Distributed access to science data repositories by the community
• Entered the Apache incubator program in January 2010
  – First Apache project we are aware of originating from a NASA Lab
• Graduated to Apache Top Level Project in November 2010
• Used for a number of science data system activities in planetary, earth, biomedicine, astrophysics and pediatric intensive care
• The basis of our joint CHLA-JPL NLM project data management system

http://oodt.apache.org

Page 33: Understanding the Meaningful Use of Open Source Software

Division of Labor
› Avoid making one component the workhorse, configurable
Technology Independence
› Guard against unexpected changes in the technology landscape
Metadata as a first-class citizen
› Descriptions of resources come in handy
Separation of software and data models
› Allow each to evolve independently
Modular, domain-agnostic
› Pick and choose from adaptable components with defined interfaces
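
The technology-independence and defined-interfaces principles can be illustrated with a small Python sketch. Everything here is hypothetical and is not OODT's actual API: `Catalog`, `InMemoryCatalog`, and `find_products` are names invented for the example. The point is only that callers depend on the interface, so the storage technology behind it can change without touching them, and that metadata is what gets queried.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Catalog(ABC):
    """Hypothetical interface that the rest of a system would code against."""

    @abstractmethod
    def ingest(self, product_id: str, metadata: Dict[str, str]) -> None:
        """Record a product together with its descriptive metadata."""

    @abstractmethod
    def query(self, key: str, value: str) -> List[str]:
        """Return the ids of products whose metadata[key] equals value."""


class InMemoryCatalog(Catalog):
    """One interchangeable backend; a database-backed class could replace it."""

    def __init__(self) -> None:
        self._metadata: Dict[str, Dict[str, str]] = {}

    def ingest(self, product_id: str, metadata: Dict[str, str]) -> None:
        self._metadata[product_id] = dict(metadata)

    def query(self, key: str, value: str) -> List[str]:
        return sorted(pid for pid, md in self._metadata.items()
                      if md.get(key) == value)


def find_products(catalog: Catalog, key: str, value: str) -> List[str]:
    """Caller code sees only the Catalog interface, never the backend."""
    return catalog.query(key, value)
```

Because the metadata keys are plain data rather than fields hard-coded into the software, the data model can grow (new keys, new disciplines) while the `Catalog` code stays unchanged, which is one reading of "separation of software and data models".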

Page 34: Understanding the Meaningful Use of Open Source Software

Software Components
› Catalog/Archive Services – data mgmt/workflow
› Product Services – access/transformation of data
› Profile Services – discovery of resources/data
› Query Services – query federation
› Web Grid – web-based messaging to connect services
Core Models
› Common XML representations as an interface to all components
› Representation of distributed resources
Domain Models
› These are specializations that sit on top of OODT

Page 35: Understanding the Meaningful Use of Open Source Software

[Figure: layered diagram with labels “Common”, “Specific”, “Unique Rqmts.” and “More Capabilities”]

OODT Core Framework
• Core Components/Models
• Process Control Framework
Science/Mission/Project-Specific
• OODT Extension
• Discipline Models
• Mission-Specific Services
New capabilities/improvements

Page 36: Understanding the Meaningful Use of Open Source Software

Code Sharing/Tech Transfer
› Could rapidly start constructing services and extending the core OODT model
› Development sharing between institutions without a lot of paperwork and hassle
Contributions back into OODT from CHLA
› Customizations from CHLA can be leveraged by other OODT projects including NASA missions, and NIH/NCI Early Detection Research Network
Allowed focus on the architecture, rather than the implementation
› See Andrew Hart’s talk on Saturday

Page 37: Understanding the Meaningful Use of Open Source Software

Open source is critical to the strategy of the organization

A great method of sustainability and longevity of software and community beyond organizational boundaries

Different licenses, communities, development practices
› Know what the differences mean

Open Source/OODT have allowed great progress in our collaboration between JPL and CHLA

Page 38: Understanding the Meaningful Use of Open Source Software

Any questions?

THANK YOU!
› [email protected]
› @chrismattmann on Twitter

Page 39: Understanding the Meaningful Use of Open Source Software

Phone: 323.361.2557
Email: [email protected]
Address: 4650 Sunset Blvd. MS#12, Los Angeles, CA 90027
Web: www.vpicu.org, www.mucmd.org

We will create a common information space for the international community of care givers providing critical care for children. Every critically ill child will have access to the Virtual PICU which will provide the essential information required to optimize their outcome.