Knowledge Spillovers in the Open Source Community: Evidence in Github

Tong Wang
Business School, University of Edinburgh

Toulouse Digital Seminar, 2017



Motivation: Knowledge Transfer

Knowledge transfer is prevalent in academia and in the programming community.

Product development in community-based organisations is becoming increasingly important, e.g. quite a few commercial software products have open-source versions.

There are basically two ways of learning:

Interaction with individuals (e.g. email exchange, brainstorming, etc.)

Reading and studying other good projects (e.g. reading academic papers, etc.)

Learning (knowledge spillover) can also be direct or indirect.


Motivation (Cont'd)

The factors that affect the effectiveness and efficiency of the learning process remain largely unexplored.

In a UGC-like (user-generated content) open-source community, do the ways of learning change?


Literature Review

Grewal, Lilien, and Mallapragada (2006) investigate how the network embeddedness of projects and project managers influences the success of projects.

Goyal, van der Leij, and Moraga-Gonzalez (2006) study the network properties of the coauthorship network in economics.

Manski (2000), Sacerdote (2001), and Angrist and Lang (2004) study neighborhood effects and spillovers in many other areas, such as labor and education.

Fershtman and Gandal (2011) study knowledge spillovers in an environment where the knowledge producers and the knowledge consumers are different.

Gandal and Stettner (2016) evaluate the importance of program modifications and function additions to project success.


Contribution

Empirically examine the association between project success/popularity and network measures in a setting where knowledge producers and consumers are the same group.

Identify the importance of Activity and Effort in the learning process.

Shed some light on the social learning process.

One of the first attempts to use a big-data approach to analyze knowledge spillovers.


Data Source

GitHub offers a unique opportunity to examine our research questions.

GitHub is a social coding platform.

GitHub is the world's largest code hosting and open-source development platform.

Every action on GitHub is recorded and is thus obtainable.

The number of forks is a good measure of popularity.

More importantly, the network structure of GitHub is quite clear.


A sample page of a project on Github


Data Sources

We obtained our data by querying the GitHub Archive with SQL and by web crawling (network spidering).

All project and contributor information before June 2016 was retrieved and stored on the spider server; the raw data exceed 300 GB.

Contributors and projects are identified by unique IDs.

Each project is linked to at least one contributor. Contributors are connected through their joint projects, so we end up with a two-mode network.
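A minimal sketch of how such a two-mode (contributor-project) network could be held in memory. The record layout, the IDs and the function name `build_two_mode` are assumptions made for illustration; the slides do not describe the actual storage format.

```python
from collections import defaultdict

# Hypothetical (contributor_id, project_id) pairs, invented for illustration.
records = [
    ("u1", "p1"),
    ("u1", "p2"),
    ("u2", "p2"),
    ("u3", "p3"),
]

def build_two_mode(records):
    """Return both sides of the bipartite contributor-project network."""
    projects_of = defaultdict(set)      # contributor -> set of projects
    contributors_of = defaultdict(set)  # project -> set of contributors
    for contributor, project in records:
        projects_of[contributor].add(project)
        contributors_of[project].add(contributor)
    return dict(projects_of), dict(contributors_of)

projects_of, contributors_of = build_two_mode(records)
```

Here contributors `u1` and `u2` are connected through their joint project `p2`, while `p3` has a single contributor and stays isolated.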


Network Illustration


Descriptive Statistics: Observations

Most GitHub projects have only one contributor, and most contributors have only 0 or 1 project.


Descriptive Statistics: Observations (Cont'd)

Most projects attract little attention. Becoming popular is generally very difficult in the open-source community.


The Project Network

The nodes of this network are projects.

There is a link between two different project nodes if there are contributors who participate in both projects.

Each link may have a value which reflects the number of contributors who participate in both projects.
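The construction above can be sketched as a one-mode projection of the contributor-project data, with each link valued by the number of shared contributors. The records and the function name `project_network` are hypothetical and only illustrate the definition; this is not the author's code.

```python
from collections import Counter, defaultdict
from itertools import combinations

def project_network(records):
    """Map each unordered project pair to its number of shared contributors."""
    projects_of = defaultdict(set)
    for contributor, project in records:
        projects_of[contributor].add(project)
    weights = Counter()
    for projects in projects_of.values():
        # Every pair of projects sharing this contributor gains one unit of weight.
        for a, b in combinations(sorted(projects), 2):
            weights[(a, b)] += 1
    return weights

# Hypothetical records, invented for illustration.
records = [
    ("u1", "p1"), ("u1", "p2"),
    ("u2", "p1"), ("u2", "p2"),
    ("u3", "p2"), ("u3", "p3"),
]
weights = project_network(records)
```

In this toy data `p1` and `p2` share two contributors, `p2` and `p3` share one, and `p1` and `p3` are not directly linked.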


Variables

Popularity/Success Measurement: the number of forks

Activity: no commit during the last 18 months

Node degree: a measure of the direct effect; the degree is the number of projects with which a project has a direct link

Closeness Centrality: a measure of the indirect effect, defined as $C(i) = \frac{N-1}{\sum_{j \in N} d(i,j)}$

Existing Period: the number of years that have elapsed since the project first appeared

Number of Contributors: the number of contributors who participated in the project

Popular Language: the five most popular languages in the TIOBE Index

Number of Comments: a proxy variable for the effort of contributors
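The two network measures can be sketched in a few lines. This assumes that d(i, j) is the unweighted shortest-path length in the project network and that N is the number of nodes; the slides do not say how disconnected projects are handled, so this sketch simply returns `None` when some node is unreachable. The adjacency list is a made-up path network.

```python
from collections import deque

def degree(adj, i):
    """Number of projects directly linked to project i."""
    return len(adj[i])

def closeness(adj, i):
    """C(i) = (N - 1) / sum_j d(i, j), with d found by breadth-first search."""
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    # Assumption: undefined when the network is disconnected or has one node.
    if len(dist) < len(adj) or total == 0:
        return None
    return (len(adj) - 1) / total

# Hypothetical path network a - b - c - d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
c_b = closeness(adj, "b")  # distances 1, 1, 2 -> 3 / 4
c_a = closeness(adj, "a")  # distances 1, 2, 3 -> 3 / 6
```

The interior node `b` is closer to everyone than the end node `a`, so it has the higher closeness centrality.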


Illustration of Closeness Centrality


Preliminary Exploration: Neural Networks


Basic Model Specification

Assume the initial success level is $\alpha$ and the final success level is $S_i$.

If the direct spillover effect exists, then a project with a higher degree is more likely to be successful. That is, in the regression $S_i = \alpha + \beta D_i$, where $D_i$ is the degree of project $i$, $\beta$ should be positive and significant.

If the indirect spillover effect exists, then the final success level should be:

$$S_i = \alpha + \sum_j \frac{\gamma}{d(i,j)} + \beta D_i$$

since it is natural to assume that a longer distance implies a lower impact. Since $C(i) = \frac{N-1}{\sum_{j \in N} d(i,j)}$, the equation above can be approximated in terms of closeness centrality as:

$$S_i = \alpha + \frac{\gamma}{N-1} C(i) + \beta D_i$$
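As a rough sketch of how the final specification could be estimated, the snippet below fits $S_i$ on a constant, $C(i)$ and $D_i$ by ordinary least squares using only the standard library. Everything in it is an assumption made for illustration: the data are made up (generated without noise from known coefficients so the fit can be checked), the helper names are hypothetical, and `gamma_scaled` stands for the combined coefficient $\gamma/(N-1)$.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients via the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up closeness and degree values for five hypothetical projects.
closeness_vals = [0.50, 0.75, 0.75, 0.50, 0.60]
degree_vals = [1, 2, 2, 1, 3]

# Synthetic success levels from known coefficients (no noise), for checking only.
true_alpha, true_gamma_scaled, true_beta = 2.0, 0.4, 0.1
success = [true_alpha + true_gamma_scaled * c + true_beta * d
           for c, d in zip(closeness_vals, degree_vals)]

# Design matrix: constant, C(i), D_i.
X = [[1.0, c, float(d)] for c, d in zip(closeness_vals, degree_vals)]
alpha, gamma_scaled, beta = ols(X, success)
```

On real data the fit would also need standard errors to judge significance; a statistics package would normally be used for that rather than hand-written normal equations.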


Main results: Basic Model

Table: Effect of Activity

Dependent variable: lfork

| | (1) | (2) |
|---|---|---|
| lCPP | 0.208∗∗∗ (0.002) | 0.203∗∗∗ (0.002) |
| lyear | −0.929∗∗∗ (0.005) | −0.920∗∗∗ (0.005) |
| activity | 0.339∗∗∗ (0.003) | 0.146∗∗∗ (0.006) |
| ldegree | 0.082∗∗∗ (0.001) | 0.036∗∗∗ (0.001) |
| lcloseness | −0.106∗∗∗ (0.020) | −0.057∗ (0.031) |
| lNO Comments | 0.312∗∗∗ (0.002) | 0.303∗∗∗ (0.002) |
| popular | 0.019∗∗∗ (0.002) | 0.019∗∗∗ (0.002) |
| activity:ldegree | | 0.074∗∗∗ (0.002) |
| activity:lcloseness | | −0.013 (0.041) |
| Constant | 2.687∗∗∗ (0.009) | 2.783∗∗∗ (0.010) |
| Observations | 691,582 | 691,582 |
| R2 | 0.251 | 0.254 |
| Adjusted R2 | 0.251 | 0.254 |
| Residual Std. Error | 0.914 (df = 691574) | 0.912 (df = 691572) |
| F Statistic | 33,077.240∗∗∗ (df = 7; 691574) | 26,125.280∗∗∗ (df = 9; 691572) |

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
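To read the interaction terms in column (2), it can help to add each main effect to its interaction coefficient. The sketch below does only that arithmetic, using coefficients copied from the table; the dictionary and function names are hypothetical, and the moderator is simply the value of the activity variable (0 or 1) as coded in the data.

```python
# Coefficients copied from column (2) of the "Effect of Activity" table.
COEF = {
    "ldegree": 0.036,
    "activity:ldegree": 0.074,
    "lcloseness": -0.057,
    "activity:lcloseness": -0.013,
}


def marginal_effect(main, interaction, moderator):
    """Slope of lfork in `main` when the interacting variable equals `moderator`."""
    return COEF[main] + COEF[interaction] * moderator


# Slope of lfork in ldegree when activity = 0 and when activity = 1.
degree_slope_0 = marginal_effect("ldegree", "activity:ldegree", 0)
degree_slope_1 = marginal_effect("ldegree", "activity:ldegree", 1)

# Slope of lfork in lcloseness when activity = 0 and when activity = 1.
closeness_slope_0 = marginal_effect("lcloseness", "activity:lcloseness", 0)
closeness_slope_1 = marginal_effect("lcloseness", "activity:lcloseness", 1)
```

The degree slope rises from 0.036 to about 0.110 when activity equals 1, while the closeness slope moves from −0.057 to about −0.070; note that the activity:lcloseness interaction itself is not statistically significant in the table.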


Main results (Cont’d)

Table: Effects of Effort

Dependent variable: lfork

| | (1) | (2) |
|---|---|---|
| lCPP | 0.208∗∗∗ (0.002) | 0.197∗∗∗ (0.002) |
| lyear | −0.929∗∗∗ (0.005) | −0.917∗∗∗ (0.005) |
| activity | 0.339∗∗∗ (0.003) | 0.342∗∗∗ (0.003) |
| ldegree | 0.082∗∗∗ (0.001) | 0.066∗∗∗ (0.001) |
| lcloseness | −0.106∗∗∗ (0.020) | −0.056∗∗∗ (0.021) |
| lNO Comments | 0.312∗∗∗ (0.002) | −0.048∗∗∗ (0.008) |
| popular | 0.019∗∗∗ (0.002) | 0.019∗∗∗ (0.002) |
| ldegree:lNO Comments | | 0.047∗∗∗ (0.001) |
| lNO Comments:lcloseness | | 0.782∗∗∗ (0.047) |
| Constant | 2.687∗∗∗ (0.009) | 2.709∗∗∗ (0.009) |
| Observations | 691,582 | 691,582 |
| R2 | 0.251 | 0.256 |
| Adjusted R2 | 0.251 | 0.256 |
| Residual Std. Error | 0.914 (df = 691574) | 0.910 (df = 691572) |
| F Statistic | 33,077.240∗∗∗ (df = 7; 691574) | 26,508.230∗∗∗ (df = 9; 691572) |

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01
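One way to read column (2) is to write out the implied slopes as functions of the comments variable. The lines below are only a worked reading of the reported coefficients, with $\mathit{lNOComments}$ measured in whatever units the regression uses; the break-even value is an arithmetic consequence of the point estimates, not a result reported on the slide.

```latex
\begin{align*}
\frac{\partial\, \mathit{lfork}}{\partial\, \mathit{ldegree}}
  &= 0.066 + 0.047 \cdot \mathit{lNOComments} \\
\frac{\partial\, \mathit{lfork}}{\partial\, \mathit{lcloseness}}
  &= -0.056 + 0.782 \cdot \mathit{lNOComments} \\
\frac{\partial\, \mathit{lfork}}{\partial\, \mathit{lcloseness}} > 0
  &\iff \mathit{lNOComments} > \frac{0.056}{0.782} \approx 0.072
\end{align*}
```

Under these point estimates the closeness slope is negative when the comments variable is zero and turns positive once it exceeds roughly 0.072, while the degree slope is positive throughout and grows with the comments variable.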


Discussion

A longer exposure helps.

More contributors, more forks.

Direct project spillover effect is significantly positive.

Knowledge spillover from inactive projects to active projects.

Manager’s effort is crucial to activate indirect spillovers.


Concluding Remarks

The knowledge distribution pattern on a social coding platform is quite different from that in the traditional open source community.

Indirect spillovers are generally weaker.

The repo manager’s effort can make a lot of difference.


Future Work

Compare the spillover effects of strong connections and weak connections

Explore the effect of bot contributors

Add more control variables, such as project size and project language

Try other measures of success
