The Operationalisation of Collaboration: in Search of a Definition and Its Consequences On Analysis Dawn M. Foster, Guido Conaldi, Riccardo De Vita Sunbelt XXXV June 2015


The Operationalisation of Collaboration: in Search of a Definition and Its Consequences On Analysis

Dawn M. Foster, Guido Conaldi, Riccardo De Vita

Sunbelt XXXV, June 2015

The Context

Pilot Study - define collaboration

Part of Larger Research Project - PhD Dissertation

Research Question for Overall Research Project:

• How do software developers, who are paid by organizations for their work, collaborate within an open source software community?


The Challenge

Open source software is a collaborative effort

But, collaboration takes many forms

And is defined in various ways

Which definitions are most important?


Literature on problem solving in open source

Unlikely organizations: survival depends on willingness to engage in decentralized problem solving (e.g., Crowston & Scozzi, 2008; Mockus et al., 2002; Conaldi et al., 2012).

Collaboration in problem solving investigated using digital traces: email, code, bug reports, mostly separately, or multidimensionally (e.g., Von Krogh, Spaeth, & Lakhani, 2003).

Contributions close in time as proxies for collaboration.
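One way to read "contributions close in time" as a collaboration proxy is sketched below. This is an illustrative assumption, not the method of the cited studies: the `(contributor, target, time)` event format, the function name `close_in_time_pairs`, and the `window` parameter are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def close_in_time_pairs(events, window):
    """Return unordered contributor pairs whose contributions to the same
    target (thread or file) fall within `window` time units of each other.

    events: iterable of (contributor, target, time) tuples (assumed format).
    """
    by_target = defaultdict(list)
    for contributor, target, time in events:
        by_target[target].append((time, contributor))

    pairs = set()
    for contributions in by_target.values():
        # Compare every pair of contributions made to the same target.
        for (t1, c1), (t2, c2) in combinations(contributions, 2):
            if c1 != c2 and abs(t1 - t2) <= window:
                pairs.add(frozenset((c1, c2)))
    return pairs


# Hypothetical events: times are in days.
events = [
    ("alice", "f.c", 1),
    ("bob", "f.c", 3),
    ("carol", "f.c", 20),
    ("dave", "g.c", 2),
]
pairs = close_in_time_pairs(events, window=7)
```

The choice of window is itself a definitional decision: a wider window treats more co-occurring contributions as collaboration.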


The Approach

Small Pilot Study

• Interviewed 4 participants

• Explored possible definitions of collaboration

• Analysis of responses

Network Analysis

• Ego-centric relational event histories for each pilot participant

• Collaboration as defined in pilot study


Research Setting

Linux kernel community:

• Open source software

• Over 85% of contributors are paid

• Neutral: competing companies contribute

• 19M lines of code, 11K developers, 1200 organisations

Pilot Research Question:

• How do definitions of collaboration impact measurement and analysis within a decentralised organisational context?


Data

Mailing list collaboration (discussion, patches, bugs)

• 4 mailing lists used by pilot participants

• Ego-net focus

• History of events reconstructed

• Basic descriptive stats

Code file collaboration

• Code files modified by pilot participants

• History of events reconstructed

• Basic descriptive stats
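A minimal sketch of what reconstructing an ego-centric event history could look like. The slides do not describe the actual procedure, so the `(contributor, target, time)` event format and the function name `ego_event_history` are assumptions made for illustration.

```python
def ego_event_history(events, ego):
    """Return, in time order, all events on targets (threads or files)
    that `ego` has contributed to at least once.

    events: iterable of (contributor, target, time) tuples (assumed format).
    """
    events = list(events)
    # Targets in the ego's neighbourhood: anything the ego touched.
    ego_targets = {target for contributor, target, _ in events
                   if contributor == ego}
    history = [event for event in events if event[1] in ego_targets]
    return sorted(history, key=lambda event: event[2])


# Hypothetical events: times are in weeks.
events = [
    ("bob", "thread-1", 5),
    ("alice", "thread-1", 2),
    ("carol", "thread-2", 1),
    ("alice", "thread-3", 9),
]
history = ego_event_history(events, "alice")
```

Events by others on targets the ego never touched (here, carol on thread-2) fall outside the ego-net and are dropped.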

Methods: Activity

In our (very) preliminary analysis, we used the following actor-level measures of activity:

Mailing lists:

• Weighted degree centrality of contributors, to capture their involvement in the discussion of development topics

Code files:

• Weighted degree centrality of contributors, to capture their activity in code production
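As a rough illustration of the activity measure, the sketch below computes weighted degree in a two-mode contributor-to-target network. It assumes each event is a `(contributor, target)` pair and that an edge's weight is the number of events on it; the slides do not specify the weighting, so treat this as one plausible reading.

```python
from collections import Counter, defaultdict


def weighted_degree(events):
    """Sum of edge weights per contributor in a two-mode network.

    events: iterable of (contributor, target) pairs (assumed format).
    Each repeated pair adds 1 to the weight of that contributor-target edge.
    """
    edge_weights = Counter(events)
    degree = defaultdict(int)
    for (contributor, _target), weight in edge_weights.items():
        degree[contributor] += weight
    return dict(degree)


# Hypothetical mailing-list events: two posts by alice in thread-1, and so on.
events = [
    ("alice", "thread-1"),
    ("alice", "thread-1"),
    ("alice", "thread-2"),
    ("bob", "thread-1"),
]
activity = weighted_degree(events)
```

The same function applies to code files by using file paths as targets and commits as events.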


Methods: Collaboration

In our preliminary analysis, we used the following actor-level measures of collaboration:

Mailing lists:

• Number of 2-paths, to capture the amount of participation by others in development topics discussed by contributors

Code files:

• Number of 2-paths, to capture the amount of contribution by others to files being worked on by contributors
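A minimal sketch of counting 2-paths, under the assumption that a 2-path runs from a contributor, through a shared target (thread or file), to a different contributor. The exact definition used in the study is not given on the slides; the event format and the name `two_path_counts` are hypothetical.

```python
from collections import defaultdict


def two_path_counts(events):
    """Count, per contributor, the 2-paths contributor -> target -> other.

    events: iterable of (contributor, target) pairs (assumed format).
    Repeated events on the same edge are counted once here.
    """
    targets_of = defaultdict(set)
    contributors_of = defaultdict(set)
    for contributor, target in events:
        targets_of[contributor].add(target)
        contributors_of[target].add(contributor)

    counts = {}
    for ego, targets in targets_of.items():
        # Every other contributor on a shared target closes one 2-path.
        counts[ego] = sum(len(contributors_of[t]) - 1 for t in targets)
    return counts


# Hypothetical code-file events.
events = [
    ("alice", "a.c"), ("bob", "a.c"), ("carol", "a.c"),
    ("alice", "b.c"), ("bob", "b.c"),
    ("dave", "c.c"),
]
collaboration = two_path_counts(events)
```

A contributor who works alone on a target (dave above) has high potential activity but zero 2-paths, which is why the two measures can give complementary pictures.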


Results: Collaboration in the Linux kernel

In person (events)

Feedback on code contributions aka patches (mailing list)

General mailing list discussions

Feedback on bugs (mailing list)

Working on same code file(s)


Results: Weighted Degree Centrality

[Figure: weighted degree over time (weeks 0 to 60), series 1 to 4; panels: Mailing Lists, Code files]

Results: Two-Path

[Figure: number of two-paths over time (weeks 0 to 60), series 1 to 4; panels: Mailing Lists, Code files]

Implications and Relevance

Collaboration is multiplex in the eyes of the contributors

The inspection of activity and (potential) collaboration in mailing lists and code shows complementary pictures

Ability to identify contributors and their actions across multiple activities of code production is paramount if we want to study the structuring of collaboration


Discussion and Future Work

Face-to-face collaboration: how to capture it?

Identities across multiple online repositories

Validation


Thank You and Questions

Authors: Dawn M. Foster, Guido Conaldi, Riccardo De Vita

University of Greenwich, Centre for Business Network Analysis


References

Data on Linux kernel contributions:

• Corbet, J., Kroah-Hartman, G. & McPherson, A., 2015. Linux Kernel Development: How Fast is it Going, Who is Doing It, What Are They Doing and Who is Sponsoring the Work. Available at: http://www.linuxfoundation.org/publications/linux-foundation/who-writes-linux-2015.

Literature:

• Crowston, K. & Scozzi, B., 2008. Bug Fixing Practices within Free/Libre Open Source Software Development Teams. Journal of Database Management, 19(2), pp. 1–30.

• Mockus, A., Fielding, R.T. & Herbsleb, J.D., 2002. Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3), pp. 309–346.

• Conaldi, G., Lomi, A. & Tonellato, M., 2012. Dynamic models of affiliation and the network structure of problem solving in an open source software project. Organizational Research Methods, 15(3), pp. 385–412.

• Von Krogh, G., Spaeth, S. & Lakhani, K.R., 2003. Community, joining, and specialization in open source software innovation: a case study. Research Policy, 32(7), pp. 1217–1241.

Backup

Results: Two-Path Repeated

[Figure: repeated two-paths over time (weeks 0 to 60), series 1 to 4; panels: Mailing Lists, Code files]