
Page 1: CIS 602-01: Computational Reproducibility (dkoop/cis602-2016fa/lectures/lecture...)

D. Koop, CIS 602-01, Fall 2016

CIS 602-01: Computational Reproducibility

Data Sharing

Dr. David Koop

Page 2:

Assignment 1• http://www.cis.umassd.edu/~dkoop/cis602/assignment1.html • Must have a GitHub account (free) • Turn in a link to your GitHub repo to myCourses • Due Friday, October 7 • Uses shell scripts

- Use Terminal.app on Mac OS X, the GitHub Desktop shell on Windows, and your favorite console on Linux

- For Windows, you can get into bash by typing bash in PowerShell • Please continue to let me know if you have questions or find bugs! • Fixing errors: git revert or git reset (--hard) • Merging: use a merge tool or consider --ours/--theirs
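The two error-recovery commands behave quite differently: git revert records a new commit that undoes an earlier one (history is preserved), while git reset --hard rewinds the branch and discards commits. A minimal sketch of the revert case, driving git from Python in a throwaway repository (assumes git is on the PATH; file names are illustrative):

```python
import subprocess, tempfile, os

def run(*args, cwd):
    """Run a command in the repo and return its stdout."""
    return subprocess.run(args, cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

repo = tempfile.mkdtemp()
run("git", "init", cwd=repo)
run("git", "config", "user.email", "you@example.com", cwd=repo)
run("git", "config", "user.name", "You", cwd=repo)

# Commit a good version, then a bad one.
path = os.path.join(repo, "script.sh")
open(path, "w").write("echo good\n")
run("git", "add", "script.sh", cwd=repo)
run("git", "commit", "-m", "good version", cwd=repo)
open(path, "w").write("echo bad\n")
run("git", "commit", "-am", "bad version", cwd=repo)

# `git revert` adds a third commit that undoes the bad one; history survives.
run("git", "revert", "--no-edit", "HEAD", cwd=repo)
print(open(path).read())  # back to the good content
# `git reset --hard HEAD~1` would instead drop the bad commit entirely.
```
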

Page 3:

[Diagram: the Git Flow branching model over time, with branches: feature branches, develop, release branches, hotfixes, master. Annotations: "Feature for future release"; "Major feature for next release"; "From this point on, 'next release' means the release after 1.0"; "Severe bug fixed for production: hotfix 0.2"; "Bugfixes from rel. branch may be continuously merged back into develop"; "Incorporate bugfix in develop"; "Only bugfixes!"; "Start of release branch for 1.0"; Tags 0.1, 0.2, 1.0.]

Author: Vincent Driessen. Original blog post: http://nvie.com/posts/a-succesful-git-branching-model
License: Creative Commons BY-SA

Git Flow


[V. Driessen, CC BY-SA, 2010]

Page 4:

CREATE A BRANCH
Create a branch in your project where you can safely experiment and make changes.

OPEN A PULL REQUEST
Use a pull request to get feedback on your changes from people down the hall or ten time zones away.

MERGE AND DEPLOY
Merge your changes into your master branch and deploy your code.

(Other steps: ADD COMMITS; DISCUSS AND REVIEW)

GitHub Flow


[GitHub Flow]

Page 5:

GitHub Study• Interview Participants: (peripheral, heavy) x (hobbyist, work (non-SW org), work (SW org)) • Results:

                  Hobbyist               Work use: non-SW org   Work use: SW org
Peripheral users  P2, P7, P10, P18, P23  P6, P12, P16, P24      P5, P17
Heavy users       P11, P15, P21, P22     P3, P4, P9, P14, P19   P1, P8, P13, P20

Table 1. Summary of interview participants.

Participants were asked to walk us through their last session on GitHub, describing how they interpreted information displayed on the site as they managed their projects, and interacted with other users’ projects. Remote participants shared their screen during the interview using Adobe Connect so we could ask specific questions about data on the site and users could demonstrate their activities on the site. Interviews lasted approximately 45 minutes to one hour. These interviews were then transcribed verbatim to support further analysis. The interviews, videos and field notes supported our analysis process.

Data analysis We applied a grounded approach to analyze the transparency related inferences in our interview responses [5]. We first identified instances of these types of inferences in five interview transcripts. For each example analyzed, we identified what information was made visible by the GitHub system, what inferences the participant was making based on that information, and the associated higher-level goal. We then conducted open coding on these responses, comparing each instance with previously examined examples and grouping examples that were conceptually similar. This process revealed categories of transparency related inferences and higher level behaviors these inferences supported. We used this first set of categories to code the remaining interviews, revealing additional categories and refining our original coding scheme to represent the dataset as a whole. We repeatedly discussed the codes and transcripts in a highly collaborative and iterative process. We continued this process until the interviews no longer revealed new behaviors not captured in our existing set of categories (theoretical saturation).

RESULTS Our analysis revealed that individuals made a rich set of inferences based on information on GitHub. These inferences were a function of four sets of visible cues (summarized in Table 2).

Recency, volume, and location of actions signaling commitment and interests As with other low-cost hosting sites, GitHub has a mix of projects that are little more than code dumps and serious projects that continue to receive attention and effort. There is also a mix of hobbyists who make occasional contributions and move on, and dedicated developers who provide project stewardship over the longer term. Our interviewees often used the recency and volume of activity as a signal of commitment or investment at the individual and project level.

Visible information about other developers’ actions influenced perceptions of their commitment and general interests. Recent activity gave a sense of the level of investment in a project. The feed of developer actions across projects helped other developers infer their current interests. One respondent described following a friend to stay up to date on what he was up to through his commits (P16). The amount of commits to a single project signaled commitment or investment to that project, while the type of commits signaled interest in different aspects of the project.

Visible Cue / Social Inference / Representative Quote:

• Recency and volume of activity / Interest and level of commitment / "this guy on Mongoid is just -- a machine, he just keeps cranking out code." (P23)

• Sequence of actions over time / Intention behind action / "Commits tell a story. Convey direction you are trying to go with the code … revealing what you want to do." (P13)

• Attention to artifacts and people / Importance to community / "The number of people watching a project or people interested in the project, obviously it's a better project than versus something that has no one else interested in it." (P17)

• Detailed information about an action / Personal relevance and impact / "If there was something [in the feed] that would preclude a feature that I would want it would give me a chance to add input to it." (P4)

Table 2. Visible cues and the social inferences they generated.

Recent activity signaling project liveness and maintenance As with many open source hosting sites, dead and abandoned projects greatly outnumber live ones that people continue to contribute and pay attention to. It can be tedious to figure out which are which, yet it is important to do so, since one does not want to adopt or contribute to a dead or dying project. In GitHub, developers described getting a sense of how ‘live’ or active a project was by the amount of commit events showing up in their feed.

“Commit activity in the feeds shows that the project is alive, that people are still adding code.” (P16)

Users also relied on historical activity to make inferences about how well the project was managed and maintained. Lots of open pull requests indicated that an owner was not particularly conscientious in dealing with people external to the project, since each open pull request indicates an offer of code that is being ignored rather than accepted, rejected or commented upon (P11).

Sequence of actions conveying meaning Visible actions on artifacts carried meaning, often as a function of their sequence, or ordering with respect to other

[Dabbish et al., 2012]

Page 6:

Social Inferences• Project management • Learning from others • Managing reputation and status

Page 7:

GitHub Actions and Reproducibility• Do actions with respect to reproducibility signal anything? • Is there a social model for reproducibility? • What faults does GitHub have (e.g., the gender studies)? • How do those faults impact reproducibility concerns?

Page 8:

Events: WatchEvent, PushEvent, ForkEvent, ..., CreateEvent

Data about GitHub via its API


[G. Gousios, 2016]

Page 9:

Events: WatchEvent, PushEvent, ForkEvent, ..., CreateEvent

An example event object:

{
  "type": "WatchEvent",
  "payload": {...},
  "public": true,
  "repo": {...},
  "created_at": "2012-05-28T12:42:30Z",
  "id": "1556481024",
  "actor": {"login": "Sarukhan"}
}

GitHub Event Information


[G. Gousios, 2016]
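Each element of the events feed is a JSON object like the sample above. A minimal sketch of pulling the interesting fields out of one event (parsing only, no API calls; the payload and repo bodies are elided on the slide, so empty objects stand in for them here):

```python
import json

# Sample event from the slide; "payload" and "repo" bodies are elided there,
# so empty objects stand in for them.
raw = """
{
  "type": "WatchEvent",
  "payload": {},
  "public": true,
  "repo": {},
  "created_at": "2012-05-28T12:42:30Z",
  "id": "1556481024",
  "actor": {"login": "Sarukhan"}
}
"""

event = json.loads(raw)
summary = f'{event["created_at"]} {event["actor"]["login"]} {event["type"]}'
print(summary)  # 2012-05-28T12:42:30Z Sarukhan WatchEvent
```
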

Page 10:

Entities: repositories, users, organizations, issues, ...

Endpoints:
/users/:user
/user/repos
/repos/:user/:repo/issues
/orgs/:org

Example /users/:user response:

{
  "type": "User",
  "login": "gousiosg",
  "name": "Georgios Gousios",
  "id": 386172,
  "created_at": ...,
  "followers": 64,
  "following": 16,
  "public_gists": 10,
  "public_repos": 20
}

GitHub Entity Information


[G. Gousios, 2016]
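The :user-style segments in the endpoints above are placeholders that get substituted with concrete values. A small sketch that fills them in (the repo name github-mirror is an illustrative value, not from the slide):

```python
# Entity endpoint templates from the slide; ":name" segments are parameters.
TEMPLATES = {
    "user":   "/users/:user",
    "repos":  "/user/repos",
    "issues": "/repos/:user/:repo/issues",
    "org":    "/orgs/:org",
}

def fill(template: str, **params) -> str:
    """Replace each :name placeholder with the given value."""
    for name, value in params.items():
        template = template.replace(f":{name}", value)
    return template

# "github-mirror" is a made-up repo name for illustration.
url = "https://api.github.com" + fill(TEMPLATES["issues"],
                                      user="gousiosg", repo="github-mirror")
print(url)  # https://api.github.com/repos/gousiosg/github-mirror/issues
```
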

Page 11:

[Plot: number of events per event type over time, 2012-2016, log scale from 1,000 to 100,000]

Event Type

CommitCommentEvent

FollowEvent

ForkEvent

IssueCommentEvent

IssuesEvent

MemberEvent

PullRequestEvent

PullRequestReviewCommentEvent

PushEvent

TeamAddEvent

WatchEvent

GHTorrent Stats


[G. Gousios, 2016]

Page 12:

Which factors affect PR acceptance?

Do we know the submitter?

Can we handle the workload?

How ready is our project for PRs?

What does the PR look like?

Factors affecting pull request acceptance


[G. Gousios, 2016]

Page 13:

Can we handle the workload?

Which factors affect the time to process PRs?

Do we know the submitter?

How ready is our project for PRs?

What does the PR look like?

Factors affecting time to process PRs


[G. Gousios, 2016]

Page 14:

CHI'15, Seoul, South Korea, April 23, 2015
@b_vasilescu @aserebrenik @vlfilkov @devanbu @baishakhir @MarkvandenBrand

Which is more effective?

Diversity and GitHub


[B. Vasilescu et al., 2015]

Page 15:


Similarity attraction theory: People prefer working with others similar to them in terms of values, beliefs, and attitudes [Byrne]

Social identity and social categorization theory: People categorize themselves into specific groups. Members of one's own group are treated better than outsiders [Tajfel]

Due to greater perceived differences between groups than within groups, diversity can lead to confusion, stress, and conflict [Horwitz & Horwitz]

Diversity Impediments


[B. Vasilescu et al., 2015]

Page 16:


Multicultural social networks promote creativity [Harvard Business School]

Driver of internal innovation and business growth [Forbes]

Companies with diverse executive boards have higher earnings and returns on equity [McKinsey]

Diverse problem solvers outperform high-ability problem solvers [Hong & Page]

Diversity Benefits


[B. Vasilescu et al., 2015]

Page 17:


Gender diversity = mix of women/men (simplifying assumption: gender is binary)

Today: gender & tenure diversity in open source software (OSS) GitHub teams

• Reports of active discrimination and sexism towards women [Nafus]
• Women are <10% in OSS [Robles et al.]
• The "hacker" culture is male-dominated and unfriendly to women [Turkle]

Gender diversity in GitHub teams


[B. Vasilescu et al., 2015]

Page 18:


Study using GHTorrent data
• Response: Productivity (#commits/quarter); Turnover (fraction of team new w.r.t. prev. quarter)
• Independent: Gender diversity (Blau index); Tenure diversity (coeff. of variation; project and overall coding tenure)
• Controls: Team size, Project age, Time, Project activity
• Mining sample: 4K projects


[B. Vasilescu et al., 2015]
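The two diversity measures can be computed directly from their definitions: the Blau index is 1 minus the sum of squared category proportions, and the coefficient of variation is the standard deviation over the mean. A sketch on a made-up five-person team (not data from the study):

```python
from statistics import pstdev, mean
from collections import Counter

def blau(categories):
    """Blau index of diversity: 1 - sum of squared category proportions.
    0 for a homogeneous group; approaches 1 as diversity grows."""
    n = len(categories)
    return 1 - sum((c / n) ** 2 for c in Counter(categories).values())

def coeff_variation(values):
    """Coefficient of variation: population std dev over the mean."""
    return pstdev(values) / mean(values)

# Hypothetical 5-person team: gender mix and tenure (quarters of activity).
team_gender = ["f", "m", "m", "m", "f"]
team_tenure = [2, 8, 8, 1, 12]
print(blau(team_gender))                    # 1 - (0.4**2 + 0.6**2) = 0.48
print(round(coeff_variation(team_tenure), 3))
```
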

Page 19:


Study Results
Effects on Productivity (#commits/quarter), per the original path diagram:
• Controls: Team size (+), Project age (+), Overall project activity (-)
• Gender diversity: + (all team sizes)
• Tenure diversity: + (mid-size & large teams); Forks: -


[B. Vasilescu et al., 2015]

Page 20:

Gitless Update


[xkcd, R. Munroe]

Page 21:

[OOPSLA Artifact Evaluation badges: Consistent, Complete, Well Documented, Easy to Reuse, Evaluated]

Purposes, Concepts, Misfits, and a Redesign of Git

Santiago Perez De Rosso and Daniel Jackson
Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA, USA
{sperezde, dnj}@csail.mit.edu

Abstract
Git is a widely used version control system that is powerful but complicated. Its complexity may not be an inevitable consequence of its power but rather evidence of flaws in its design. To explore this hypothesis, we analyzed the design of Git using a theory that identifies concepts, purposes, and misfits. Some well-known difficulties with Git are described, and explained as misfits in which underlying concepts fail to meet their intended purpose. Based on this analysis, we designed a reworking of Git (called Gitless) that attempts to remedy these flaws.

To correlate misfits with issues reported by users, we conducted a study of Stack Overflow questions. And to determine whether users experienced fewer complications using Gitless in place of Git, we conducted a small user study. Results suggest our approach can be profitable in identifying, analyzing, and fixing design problems.

Categories and Subject Descriptors D.2.2 [Software Engineering]: Design Tools and Techniques; D.2.7 [Software Engineering]: Distribution, Maintenance and Enhancement—Version Control

Keywords concepts; concept design; design; software design; usability; version control; Git.

1. Introduction
Experiment This paper describes an experiment in software design. We took a popular software product that is both highly regarded for its functionality, flexibility and performance, and yet is also frequently criticized for its apparent complexity, especially by less expert users.

First, we did an analysis of the product, in which we applied some new design principles [16] in an attempt to identify problematic aspects of the design, suggesting respects in which the design might be improved. Since any such analysis is likely to be influenced by subjective factors (not least our own experiences using the product, and the particular contexts in which we used it), we corroborated the analysis by examining a large number of posts in a popular Q&A forum, to determine whether the issues we identified were in fact aligned with those that troubled other users.

Second, we reworked the design to repair the deficiencies identified by our analysis, and implemented the new design. To evaluate the redesign, we conducted a user study in which users with a range of levels of expertise were asked to complete a variety of tasks using the existing and new product. We measured the time they took, and obtained feedback on their subjective perceptions.

In some respects, this project has been a fool's errand. We picked a product that was popular and widely used so as not to be investing effort in analyzing a strawman design; we thought that its popularity would mean that a larger audience would be interested in our experiment. In sharing our research with colleagues, however, we have discovered a significant polarization. Experts, who are deeply familiar with the product, have learned its many intricacies, developed complex, customized workflows, and regularly exploit its most elaborate features, are often defensive and resistant to the suggestion that the design has flaws. In contrast, less intensive users, who have given up on understanding the product, and rely on only a handful of memorized commands, are so frustrated by their experience that an analysis like ours seems to them belaboring the obvious.

Nevertheless, we hope that the reader will approach our analysis with an open mind. Although our analysis and experiment are far from perfect, we believe they contribute new ideas to an area that is important and much discussed by practitioners, but rarely studied by the research community.

Subject Git, according to its webpage, is a free and open source distributed version control system that is easy to learn, has a tiny footprint, lightning fast performance, and features

Gitless Update • [De Rosso and Jackson, 2016] • New Paper, same authors • OOPSLA 2016 • More examples with categorizations • Better comparisons

Page 22:

Misfit / Question (upvotes / views)

Saving Changes
Q1 Using Git and Dropbox together effectively? (927 / 215523)
Q2 Backup a Local Git Repository (122 / 78674)
Q3 Fully backup a git repo? (54 / 37502)
Q4 Is it possible to push a git stash to a remote repository? (105 / 30820)
Q5 Git fatal: Reference has invalid format: refs/heads/master (90 / 25717)
Q6 Is "git push --mirror" sufficient for backing up my repository? (34 / 18415)
Q7 How to back up private branches in git (33 / 10580)

Switching Branches
Q8 The following untracked working tree files would be overwritten by checkout (365 / 378331)
Q9 git: Switch branch and ignore any changes without committing (148 / 129120)
Q10 Why git keeps showing my changes when I switch branches (modified, added, deleted files) no matter if I run git add or not? (47 / 10524)

Detached Head
Q11 Git: How can I reconcile detached HEAD with master/origin? (784 / 397694)
Q12 Fix a Git detached head? (490 / 397985)
Q13 Checkout GIT tag (125 / 98328)
Q14 git push says everything up-to-date even though I have local changes (113 / 79203)
Q15 Why did my Git repo enter a detached HEAD state? (202 / 78856)
Q16 Why did git set us on (no branch)? (65 / 41866)
Q17 gitx How do I get my 'Detached HEAD' commits back into master (136 / 42794)

File Rename
Q18 Handling file renames in git (315 / 242864)
Q19 Is it possible to move/rename files in git and maintain their history? (367 / 153701)
Q20 Why might git log not show history for a moved file, and what can I do about it? (34 / 17099)
Q21 How to REALLY show logs of renamed files with git? (60 / 12923)

File Tracking
Q22 Why does git commit not save my changes? (177 / 142189)
Q23 Git commit all files using single command (165 / 141815)

Untracking File
Q24 Ignore files that have already been committed to a Git repository (1588 / 387112)
Q25 Stop tracking and ignore changes to a file in Git (975 / 353136)
Q26 Making git "forget" about a file that was tracked but is now in .gitignore (1458 / 286435)
Q27 git ignore files only locally (562 / 120700)
Q28 Untrack files from git (218 / 140663)
Q29 Git: How to remove file from index without deleting files from any repository (110 / 61498)
Q30 Ignore modified (but not committed) files in git? (135 / 38293)
Q31 Ignoring an already checked-in directory's contents? (169 / 49692)
Q32 Apply git .gitignore rules to an existing repository [duplicate] (40 / 28286)
Q33 undo git update-index --assume-unchanged <file> (165 / 37262)
Q34 using gitignore to ignore (but not delete) files (55 / 23381)
Q35 How do you make Git ignore files without using .gitignore? (58 / 23709)
Q36 Can I get a list of files marked --assume-unchanged? (191 / 20184)
Q37 Keep file in a Git repo, but don't track changes (74 / 15572)
Q38 Committing Machine Specific Configuration Files (58 / 5934)

Empty Directory
Q39 How can I add an empty directory to a Git repository? (2383 / 432218)
Q40 What are the differences between .gitignore and .gitkeep? (841 / 121484)
Q41 How to .gitignore all files/folder in a folder, but not the folder itself? [duplicate] (227 / 80119)

Table 3: List of misfits with their related Stack Overflow questions

Some users expect branching to work just as in Gitless. In Q10, the OP states, "I thought that, while using branches, whatever you do in one branch, it's invisible to all the other branches. Is not that the reason of creating branches?" In Q8-9, the OP is trying to switch branches, but uncommitted changes prevent her from doing so.

For "detached head," all questions (Q11-17) are of users that inadvertently got their repository into a detached head state, are confused about it, and now need help to get their repository back to a sane state.

Questions for "file rename" (Q18-21) all arise from cases in which users are trying to figure out how to get Git to, as Q21 says, "really" track renames.

In Q22 the OP is confused about the staging area and wondering why commit doesn't save the changes. In Q23, the OP wants to simply skip it altogether.

There's a myriad of questions about how to untrack a committed file (Q24-32, Q34-35, Q37-38). Those who figured out that the way to do it is by marking the file as assumed unchanged are left wondering how to list this kind of file (Q36) or how to undo the marking (Q33).

The need for sharing empty directories is so common (Q39, Q41) that there's a convention to use the name .gitkeep for the bogus file added to an empty directory, making novices wonder what the difference is between .gitkeep and .gitignore (Q40).
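The recurring "untracking file" misfit has a standard workaround: git rm --cached removes a file from the index without deleting it from disk, and a .gitignore entry keeps it out of future commits. A sketch driving this from Python in a throwaway repository (assumes git is on the PATH; file names are illustrative):

```python
import subprocess, tempfile, os

def git(*args, cwd):
    """Run a git subcommand in the repo and return its stdout."""
    return subprocess.run(("git",) + args, cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

repo = tempfile.mkdtemp()
git("init", cwd=repo)
git("config", "user.email", "you@example.com", cwd=repo)
git("config", "user.name", "You", cwd=repo)

# Commit a config file that should never have been tracked.
cfg = os.path.join(repo, "local.cfg")
open(cfg, "w").write("machine-specific settings\n")
git("add", "local.cfg", cwd=repo)
git("commit", "-m", "oops: tracked local config", cwd=repo)

# Untrack it (keeps the file on disk) and ignore it going forward.
git("rm", "--cached", "local.cfg", cwd=repo)
open(os.path.join(repo, ".gitignore"), "w").write("local.cfg\n")
git("add", ".gitignore", cwd=repo)
git("commit", "-m", "untrack and ignore local.cfg", cwd=repo)

print(os.path.exists(cfg))                       # True: file survives on disk
print("local.cfg" in git("ls-files", cwd=repo))  # False: no longer tracked
```
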

8.2 User Study
Through the week of August 24-28, 2015, we conducted a usability test in which we recruited Git users and asked them to complete a series of short tasks using Git and Gitless (a so-called "within-subjects design"). The goal of the study was to evaluate the usability impact of the conceptual transformations applied in Gitless to address misfits.

Participants were recruited by an email sent through a public lab mailing list composed mostly of current and alumni students, faculty, and research staff. In the study application we asked applicants to rate their own proficiency using Git ("novice," "regular user," or "expert user"), indicate how

Git Misfits and Stack Overflow Questions


[De Rosso and Jackson, 2016]

Page 23:

Task Evaluation


Task / Description / Misfit

1. Add readme file: Create a new file (readme), track it, make another modification to it, and create a commit that includes all changes made to the file. (File tracking)
2. Let users input weight in kilos: Create a new branch feat/kilos, switch to it, make a change and commit. We then ask them to make another change that is left uncommitted. (No related misfit; see §8.2.1)
3. Let users input height in meters: Create a new branch feat/meters, switch to it and make a change. The participant then needs to switch to master to fix a bug. (Switching branches)
4. Wrap with features: Go back to working on the kilos feature, which involves switching to the feat/kilos branch and bringing back uncommitted changes. (Switching branches)
5. Fixing conflicts: Switch to another branch in the middle of conflicts. (Switching branches)
6. Code cleanup: Undo an unpushed commit (as if it never existed before). (Detached Head)

Table 4: List of tasks with their related misfit

Task | Git | Gitless | Difference
1. Add readme file | 81.82% | 100.00% | +18.18%
2. Let users input weight in kilos | 90.91% | 63.64% | -27.27%
3. Let users input height in meters | 72.73% | 81.82% | +9.09%
4. Wrap with features | 54.55% | 63.64% | +9.09%
5. Fixing conflicts | 54.55% | 90.91% | +36.36%
6. Code cleanup | 63.63% | 81.82% | +18.19%

Table 5: Task success rates

we ran a k-means clustering algorithm (k=3) using Git task completion times and got a total of 4 novices, 3 regular and 4 experts. We had planned to rely on our subjects' own classifications of their level of Git proficiency, but we found this to be more subjective than we had anticipated, with their perceptions influenced by how they use Git and how aware they are of what they do not know. We therefore decided instead to establish a more objective classification by clustering them based on Git task completion times.
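The clustering step described above is one-dimensional k-means with k=3. A pure-Python sketch on made-up completion times (deterministic min/median/max initialization rather than random seeding, which the paper does not specify):

```python
def kmeans_1d(values, k=3, iters=100):
    """Plain 1-D k-means: returns cluster centers (ascending) and a label per
    value, where label 0 is the lowest-mean (fastest) cluster."""
    vs = sorted(values)
    # Deterministic init: min, evenly spaced interior points, max.
    centers = [float(vs[(len(vs) - 1) * i // (k - 1)]) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            clusters[min(range(k), key=lambda i: abs(v - centers[i]))].append(v)
        new = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new == centers:   # assignments stable: converged
            break
        centers = new
    labels = [min(range(k), key=lambda i: abs(v - centers[i])) for v in values]
    return centers, labels

# Hypothetical total completion times (minutes) for 11 participants.
times = [12, 14, 13, 15, 25, 27, 24, 40, 42, 45, 41]
centers, labels = kmeans_1d(times)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2]  (experts, regulars, novices)
```
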

Task Success Rates Task success rates are shown in Table 5. Overall, participants did better using Gitless. In Task 2, participants that failed the task using Gitless (4) did so because they never switched to branch feat/kilos after creating it. We think the reason for this could be a confusion with gl branch -c, where most assumed that it would not only create the branch given as input but also switch to it (like git checkout -b) when in fact it does not.

Task Completion Times Task completion times are shown in Fig. 1. There is more variance in the completion time for Git. This is perhaps because of the different proficiency levels participants had with Git, which caused them to struggle with tasks in varying degrees. None of the participants had used Gitless before, and the 3-minute overview created a uniform understanding, so they all spent a similar amount of time doing the tasks. Most participants completed tasks 3, 4, and 5 (which are all branching related) faster when using Gitless than when using Git. A paired t-test found the differences in task 5 to be significant (t=3.95, df=10, p=0.003). For task 4 the result was p=0.066. In all of these tasks, having truly independent lines of development proved useful. Some participants (1 novice, 1 regular, 1 expert) highlighted branching in Gitless: "Branch handling was way more intuitive than with git. I would use gitless to deal with branches", "Keeping branches separate is great. [...] Transitions between branches are very smooth", "I really enjoyed the fact that one can transition between branches without committing or staging—that's a killer feature."

Figure 1: Box plots of task completion times, panels (a)-(f) for Tasks 1-6, Git vs. Gitless.
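The paired t-test used here compares each participant's Git and Gitless times on the same task. A sketch computing the t statistic and degrees of freedom from first principles on hypothetical times (not the study's data):

```python
from math import sqrt

def paired_t(a, b):
    """Paired t statistic: t = mean(d) / (sd(d) / sqrt(n)), where d = a - b
    elementwise, sd uses the sample (n-1) denominator, and df = n - 1."""
    n = len(a)
    d = [x - y for x, y in zip(a, b)]
    m = sum(d) / n
    var = sum((x - m) ** 2 for x in d) / (n - 1)
    return m / sqrt(var / n), n - 1

# Hypothetical per-participant completion times (minutes) on one task.
git_times     = [14, 12, 18, 9, 16, 20, 11, 15, 13, 17, 19]
gitless_times = [10, 11, 12, 8, 12, 13, 10, 11, 10, 12, 13]
t, df = paired_t(git_times, gitless_times)
print(round(t, 2), df)  # df = 10, matching the paper's df for 11 participants
```
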

Questionnaire Results Questionnaire results are shown in Fig. 2. Overall, participants found Gitless more satisfying than Git (Mgit=3.91, Mgl=5.09) and less frustrating (Mgit=4.73, Mgl=2.91) but there's no big difference in efficiency (Mgit=4.54, Mgl=4.91), difficulty (Mgit=3.45, Mgl=3.09) and confusion (Mgit=3.82, Mgl=3.72). This apparent contradiction might be due to the fact that all of the participants had used Git before but were encountering Gitless for the first time without any substantive training. Some participants (2 regular, 1 expert) commented that indeed their problems with Gitless were mostly due to their lack of practice using it: "The hardest part was learning the new commands. With more experience, I can see how this could be a better way of using git", "Overall, the frustrations I ran

Table 4: List of tasks with their related misfit

1. Add readme file: Create a new file (readme), track it, make another modification to it, and create a commit that includes all changes made to the file. (Misfit: File tracking)
2. Let users input weight in kilos: Create a new branch feat/kilos, switch to it, make a change and commit. We then ask them to make another change that is left uncommitted. (No related misfit; see §8.2.1)
3. Let users input height in meters: Create a new branch feat/meters, switch to it and make a change. The participant then needs to switch to master to fix a bug. (Misfit: Switching branches)
4. Wrap with features: Go back to working on the kilos feature, which involves switching to the feat/kilos branch and bringing back uncommitted changes. (Misfit: Switching branches)
5. Fixing conflicts: Switch to another branch in the middle of conflicts. (Misfit: Switching branches)
6. Code cleanup: Undo an unpushed commit (as if it never existed before). (Misfit: Detached head)

Table 5: Task success rates

Task                                  Git       Gitless   Difference
1. Add readme file                    81.82%    100.00%   +18.18%
2. Let users input weight in kilos    90.91%    63.64%    -27.27%
3. Let users input height in meters   72.73%    81.82%    +9.09%
4. Wrap with features                 54.55%    63.64%    +9.09%
5. Fixing conflicts                   54.55%    90.91%    +36.36%
6. Code cleanup                       63.63%    81.82%    +18.19%
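With 11 participants per condition (the study's sample size), each percentage in Table 5 corresponds to a whole number of successes out of 11. A quick sanity check; the success counts below are inferred from the percentages, not stated explicitly in the paper:

```python
# Map a few Table 5 entries back to k-out-of-11 success counts.
counts = {
    "Task 1, Git": 9,       # 9/11  -> 81.82%
    "Task 2, Gitless": 7,   # 7/11  -> 63.64%
    "Task 5, Git": 6,       # 6/11  -> 54.55%
    "Task 5, Gitless": 10,  # 10/11 -> 90.91%
}
rates = {task: round(100 * k / 11, 2) for task, k in counts.items()}
print(rates)
```

The same arithmetic suggests the "63.63%" entry for Task 6 is 7/11 as well.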

we ran a k-means clustering algorithm (k=3) using Git task completion times and got a total of 4 novices, 3 regular, and 4 experts. We had planned to rely on our subjects’ own classifications of their level of Git proficiency, but we found this to be more subjective than we had anticipated, with their perceptions influenced by how they use Git and how aware they are of what they do not know. We therefore decided instead to establish a more objective classification by clustering them based on Git task completion times.
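That grouping step can be sketched as a tiny one-dimensional k-means (k=3). The data, the initialization, and the assumption that the fastest cluster is the experts are all illustrative; the paper does not give its exact procedure:

```python
def kmeans_1d(values, k=3, iters=100):
    """Cluster scalar values (e.g., total completion times) into k groups."""
    s = sorted(values)
    # Spread the initial centers across the range of the data.
    centers = [s[(len(s) - 1) * i // (k - 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return clusters

# Illustrative total completion times in minutes (not the study's data).
times = [22, 24, 25, 26, 38, 40, 41, 55, 58, 60, 61]
experts, regulars, novices = kmeans_1d(times)  # fastest cluster first
```

On this made-up data the split comes out 4/3/4, matching the counts reported above.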

Task Success Rates
Task success rates are shown in Table 5. Overall, participants did better using Gitless. In Task 2, participants that failed the task using Gitless (4) did so because they never switched to branch feat/kilos after creating it. We think the reason for this could be a confusion with gl branch -c, where most assumed that it would not only create the branch given as input but also switch to it (like git checkout -b) when in fact it does not.

Task Completion Times
Task completion times are shown in Fig. 1. There is more variance in the completion time for Git. This is perhaps because of the different proficiency levels participants had with Git, which caused them to struggle with tasks in varying degrees. None of the participants had used Gitless before, and the 3-minute overview created a uniform understanding, so they all spent a similar amount of time doing the tasks. Most participants completed tasks 3, 4, and 5 (which are all branching related) faster when using Gitless than when using Git. A paired t-test found the differences in task 5 to be significant (t=3.95, df=10, p=0.003). For task 4 the result was p=0.066. In all of these tasks, having truly independent lines of development proved useful. Some participants (1 novice, 1 regular, 1 expert) highlighted branching in Gitless: “Branch handling was

Figure 1: Box plots of task completion times (in minutes) for Tasks 1–6, comparing Git and Gitless.


[De Rosso and Jackson, 2016]

Page 24

Figure 2: Post-session and post-study questionnaire results (1=strongly disagree, 4=neutral, 7=strongly agree), with standard error bars. Panels (a)–(e) show Satisfaction, Efficiency, Difficulty, Frustration, and Confusion for Git versus Gitless, broken down by Git proficiency (all, novice, regular, expert); panel (f) compares agreement with “I enjoyed using Gitless”, “I found Gitless to be easier to learn than Git”, “I found Gitless to be easier to use than Git”, and “I would continue using Gitless if I could”.

into with gitless were because I wasn’t familiar enough yet to know the terms/commands, while my frustrations with git were due to a limitation of the tool”, “Most of what slowed me down was still thinking in git commands rather than gitless commands. [...] I have over 6 years of experience with git and less than an hour with gitless.”

A paired t-test found the difference in satisfaction for novices significant (t=-3.81, df=3, p=0.032). (p=0.134 for all proficiency levels.) We also found the difference in frustration significant for all proficiency levels and for novices (t=2.60, df=10, p=0.026 and t=3.81, df=3, p=0.032).
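The paired t statistic used throughout these comparisons is straightforward to compute by hand; a minimal sketch with made-up per-participant numbers (not the study's data):

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired (dependent-samples) t statistic and degrees of freedom."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # t = mean of differences / standard error of the differences
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Illustrative completion times (minutes) for one task, per participant.
git_times     = [10, 12, 9, 11]
gitless_times = [8, 9, 8, 9]
t, df = paired_t(git_times, gitless_times)
```

Pairing by participant is what makes the test appropriate here: each person did the same task with both tools, so the differences control for individual speed.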

Results comparing Git with Gitless are encouraging. Novices especially liked it, while experts didn’t find it worse than Git. Overall, participants enjoyed using Gitless (M=5.18) and found it easier to learn (M=4.91) and use (M=5.09). One (novice) participant stated “I found myself using status and diff less often because the simplified workflow and terminology gave me greater confidence that my mental model matched Gitless’s.” When asked if they would continue using Gitless the results are somewhat split (M=4.45). Some (1 regular, 1 expert), for example, showed concern about its power: “Gitless was easier to use for the tasks these sessions asked me to perform, but I really like having a Git stash and staging area to work with in Git”, “[...] the ability to walk away from a branch in any state is very useful and would go far in helping new git users [...] However, I make heavy use of the staging area and interactive rebase and I would not be willing to part with either.” These comments are not surprising since Gitless is a mere prototype while Git has been in use for over 10 years. (Also, at the time of the experiment, we didn’t have a partial flag to select segments of files to commit, or a command to clean up history.)

Note that while results suggest that our redesign efforts were fruitful (especially for novices, without a notable negative impact on experts), this doesn’t mean Gitless is a “better” VCS than Git. Our study focused only on misfits and did so in a controlled environment. A full evaluation of a VCS would require testing it in the context of large projects with complex requirements. Yet our results provide some empirical evidence that suggests our approach can be profitable in addressing design-related usability problems.

8.3 Threats to Validity

Internal In addition to the conceptual model, the type (e.g., command language, direct manipulation) and quality of the user interface affect usability. This is not a major factor in our study, since Gitless has a command-line interface that follows the same Unix conventions as Git; the only differences are in the command names (and of course their semantics).

External The user study was conducted on only 11 people who are, or have previously been, affiliated with computer science at MIT, and may not generalize to Git users in general. To mitigate this factor, Gitless is available online for free, and anyone can download and try the tool. Our findings may

Post-Study Survey Results


[De Rosso and Jackson, 2016]

Page 25


The Conundrum of Sharing Research Data

C. L. Borgman

Page 26

Data Sharing
• What is data sharing?
- "…the release of research data for use by others"
• What is data?
- "broadly inclusive"
- digital literature (e.g. games), data and databases requiring computers and software (e.g. genomic sequencing), observational data (remote sensing), and generated or compiled information (by humans or machines)
- physical and life sciences: most gathered/produced by researchers
- social sciences: gather/produce, also obtain from public records
- humanities: records from human culture (archives)
• dataset: grouping, content, relatedness, purpose


Page 27

Categories of Data
• Observational: e.g. weather; may go across time and location
• Computational: data from computer models and simulations
• Experimental: lab or field experiments (may be replicated)
• Records: e.g. government records
• (via National Science Board)


Page 28

method is slow and too insensitive to distinguish between human and animal sources of bacteria. The more sophisticated method is quantitative polymerase chain reaction (qPCR), adapted from medical applications, which requires greater expertise and is much more expensive. This method is faster and more sensitive, but results will vary between laboratories due to choices of local protocols, filter material, machine type and model, and handling methods. Protocols and results are shared between partner laboratories seeking to perfect the method, but little other than the methods of data collection, protocols, and final curves might be reported in the journal articles. Biological samples are fragile; they degrade quickly or are destroyed in the analysis process.

At the other end of the specificity dimension are observatories, which are institutions for the observation and interpretation of natural phenomena. Examples include NEON and LTER in ecology (National Ecological Observatory Network, 2010; U.S. Long Term Ecological Research Network, 2010; Porter, 2010), GEON in the earth sciences (GEON, 2011; Ribes & Bowker, 2008), and synoptic sky surveys in astronomy (Panoramic Survey Telescope & Rapid Response System, 2009; Large Synoptic Survey Telescope, 2010; Sloan Digital Sky Survey, 2010). Observatories attempt to provide a comprehensive view of some whole entity or system, such as the earth or sky. Global climate modeling, for example, depends upon consistent data collection of climate phenomena around the world at agreed upon times, locations, and variables (Edwards, 2010).

The value of observatories lies in systematically capturing the same set of observations over long periods of time. Astronomical observatories are massive investments, intended to serve a large community. Investigators and others can mine the data to ask their own questions or to identify bases for comparison with data from other sources. Studies of the role of dust emission in star formation make use of observatory data. In this star dust scenario, a team of astrophysics researchers queries several data collections that hold observations at different wavelengths, extracting many years of observations taken in a specific star-forming region of interest. They apply several new methods of data analysis to model physical processes in star formation. By combining data from multiple observatories, they produce empirical results that enable them to propose a new theory. Typically the combined dataset is released when they publish the journal article describing their results.

Scope of data collection. The second dimension of Figure 1 is the scope of data collection. At one pole are

FIG. 1. Purposes for Collecting Data. Dimensions: Specificity of Purpose (Exploratory to Observatory), Scope of Data Collection (Describe Phenomena to Model System), and Goal of Research (Empirical to Theoretical). Scenarios plotted: BQ (Beach Quality), SD (Star Dust), OS (Online Survey), AR (Archival Records).

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY—June 2012, 1063. DOI: 10.1002/asi

Purposes for Collecting Data


Page 29

Goal of Research
• "Observations of the physical universe occur at a unique place and time and can never be reconstructed."

Page 30

agree upon what data will be collected, by what techniques and instruments, and who has the rights and responsibilities to analyze, publish, and release those data (Borgman, Bowker, Finholt, & Wallis, 2009; David, 2004; Olson, Zimmerman, & Bos, 2008; Ribes & Finholt, 2007). The historian works alone with archival records. The sociologist conducting the online survey of student attitudes may work alone or with a small team of students and statisticians. The beach quality research is conducted by 5 to 10 graduate and undergraduate students, and led by a single investigator. In contrast, dozens—if not hundreds or even thousands of people around the world—may be involved in collecting and curating data in observatories. Those who draw upon the data from those observatories may be individuals or teams of any size.

Labor to collect data. Approaches to data collection also differ in the amount of human labor required—the second dimension in Figure 2. Investigators in beach quality, marine biology, or other field research may spend days, weeks, or months hand-gathering physical samples of soil, water, or plants, which then must be processed in a laboratory to extract data—a process that also may require days, weeks, or months. Similarly, the historian may spend months or years in historical archives, taking notes on a laptop, or only with pencil on paper, by the rules of some archives. In this archival records scenario, the scholar may devote months or years to extracting useful data from those notes. These labor-intensive approaches have the advantage of flexibility and local control by the investigators. They have the disadvantages, from a data sharing perspective, of being difficult to replicate and of producing data that are not consistent in form or structure.

Machine-collected observations, whether by telescopes, sensor networks, online survey software, or social network logs, may be labor-intensive to design and develop, but once deployed can produce massive amounts of data that can be used by many people. Major telescopes, both on land and in space, for example, require long-term collaborations among scientists and technologists. Data structures, management, and curation plans are developed in parallel with the design of studies and instruments, a process of a decade or more. Machine-collected data tend to be consistent and structured, and to scale well, but considerable expertise is required to interpret them. Conversely, these forms of data collection are less flexible and adaptable to an individual investigator’s

FIG. 2. Approaches to handling data. Dimensions: People Involved (Individual Investigator to Collaborative Teams), Labor to Collect Data (By Hand to By Machine), and Labor to Process Data (By Hand to By Machine). Scenarios plotted: BQ (Beach Quality), SD (Star Dust), OS (Online Survey), AR (Archival Records).

Approaches in Handling Data


Page 31

Labor to Process Data
• Machine-collected versus human
- Citizen science
• "Generally speaking, the more handcrafted the data collection and the more labor-intensive the postprocessing for interpretation, the less likely that researchers will share their data."

Page 32

benefit from the act of sharing data, such as the use of those data for a particular purpose (Merriam-Webster’s Collegiate Dictionary, 2005).

Four rationales are presented in Figure 3, positioned on two axes. The sources for the model are the policy documents and studies of data sharing cited herein, and the author’s participation in public discourse on these issues. The four rationales are to (a) reproduce or verify research, (b) make results of publicly funded research available to the public, (c) enable others to ask new questions of extant data, and (d) advance the state of research and innovation. The dimensions on which these rationales are positioned are arguments for sharing and beneficiaries of sharing. The model is not exhaustive either in terms of rationales or dimensions, but is offered as a useful framework for examining the complex interactions of players, policies, and practices involved in sharing research data.

The arguments dimension (vertical axis) positions the rationales by their emphasis on the needs of the research community or the needs of the public at large. Researchers, funding agencies, and journals often make different arguments for the value of sharing data. Motivations of the many stakeholders may be aligned, but often they are in conflict.

The beneficiaries dimension (horizontal axis) positions the rationales by their emphasis on benefits to researchers who produce the data or benefits to those who might use research data. Here also, motivations of stakeholders may be aligned, but often they are in conflict. Funding agencies are responsible to their research communities and to the public. Journals must serve their readers, their authors, and their publishers. Researchers’ incentives to release their own data may or may not align with their motivations to gain access to the data of others. Similarly, funding agencies’ and journals’ motivations for data release may conflict with the incentives of the researchers who create those data.

Neither dimension is absolute; the poles represent relative positions of people or situations. For example, a researcher or policy maker may make one argument on behalf of the producers of data and another on behalf of the users. Similarly, an argument made in the name of scholarship may also serve the public good. These arguments and beneficiaries are not mutually exclusive; rather, they provide a two-dimensional space in which to place the various rationales in favor of sharing research data.

Subtle distinctions in the rationales for data sharing may lead to markedly different policies, economic models, research practices, curation practices, and degrees of compliance. Of particular concern is how those rationales align with the incentives of those whose work produces the data. Accordingly, discussion of the four rationales focuses most heavily on the concerns of data producers and on their abilities, motivations, and incentives to share their data.

The model proposed here is intended to provoke discussion among the many stakeholders in research data. Most of the examples are drawn from the sciences and social sciences, as these are the areas most studied and are on the front lines of current policy debates. This analysis can be extrapolated to the humanities, where similar policies for data sharing are under discussion (Kansa, Kansa, Burton, & Stankowski, 2010; Unsworth et al., 2006). The ability to implement any data sharing policy will depend on many factors, including local data practices, differences in the intellectual property rights intrinsic to data sources, and the need to maintain confidentiality of human subjects (Borgman, 2007).

To Reproduce or to Verify Research

Reproducibility or replication of research is viewed as “the gold standard” for science (Jasny, Chin, Chong, & Vignieri, 2011), yet it is the most problematic rationale for sharing research data. This rationale is fundamentally research driven but can also be viewed as serving the public good. Reproducing a study confirms the science, and in doing so confirms that public monies were well spent. However, the argument can be applied only to certain kinds of data and types of research, and rests upon several questionable assumptions.

Pressure is mounting to share data for the purposes of reproducing research findings. A recent special issue of Science on replication and reproducibility examines the approaches, benefits, and challenges across multiple fields (Ioannidis & Khoury, 2011; Jasny et al., 2011; Peng, 2011; Ryan, 2011; Santer, Wigley, & Taylor, 2011; Tomasello & Call, 2011). The authors encourage data sharing to increase the likelihood of replication, while acknowledging the very different methods and standards for reproducibility in each field discussed. Particularly challenging are the “omics” fields (e.g., genomics, transcriptomics, proteomics, metabolomics), in which “clinically meaningful discoveries are hidden within millions of analyses” (Ioannidis & Khoury, 2011, p. 1230). Fine distinctions are made between reproducibility, validation, utility, replication, and

FIG. 3. Rationales for Sharing Research Data. Axes: Arguments for Sharing (Research-driven to Public-driven) and Beneficiaries of Sharing (Data Producers to Data Users). Rationales plotted: R1 (Reproduce/verify), R2 (Serve public interest), R3 (Ask new questions), R4 (Advance research).

Rationales for Sharing Data


Page 33

Reproducibility
• "…it is the most problematic rationale for sharing research data"!
• Data is not enough
• Cannot reduce research to "mechanistic procedures"
• [Depends more on interpretation than data]

Page 34

Data Sharing
• Difficult
• Need to see use?
• Data may rely on software
• Streaming data?

Page 35

Next Time
• Two papers by Vines et al. that study data availability in biology papers
• Reading Response
- Why is availability an issue?
- What factors contribute to availability?
- What solutions would help improve access to data?
- How does data fit in with reproducibility?