
Scientific collaboration patterns vary with scholars’ academic ages

Wei Wang1,2 • Shuo Yu1,2 • Teshome Megersa Bekele1,2 • Xiangjie Kong1,2 • Feng Xia1,2

Received: 28 November 2016 / Published online: 19 April 2017
© Akadémiai Kiadó, Budapest, Hungary 2017

Abstract Scientists may encounter many collaborators of different academic ages throughout their careers. Thus, they are required to make essential decisions to commence or end a creative partnership. This process can be influenced by strategic motivations because young scholars are pursuers while senior scholars are normally attractors during new collaborative opportunities. While previous works have mainly focused on cross-sectional collaboration patterns, this work investigates scientific collaboration networks from scholars’ local perspectives based on their academic ages. We aim to harness the power of big scholarly data to investigate scholars’ academic-age-aware collaboration patterns. From more than 621,493 scholars and 2,646,941 collaboration records in Physics and Computer Science, we discover several interesting academic-age-aware behaviors. First, in a given time period, the academic age distribution follows a long-tail distribution, in which more than 80% of scholars are of young age. Second, as academic age increases, the degree centrality of scholars goes up accordingly, which means that senior scholars tend to have more collaborators. Third, based on the collaboration frequency and distribution between scholars of different academic ages, we observe an obvious homophily phenomenon in scientific collaborations. Fourth, scientific collaboration triads consist mostly of beginning scholars. Furthermore, the differences in collaboration patterns between these two fields in terms of academic age are discussed.

Keywords Scientific collaboration · Academic age · Collaboration pattern

✉ Xiangjie Kong [email protected]

1 School of Software, Dalian University of Technology, Dalian 116620, China

2 Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian, China

Scientometrics (2017) 112:329–343
DOI 10.1007/s11192-017-2388-9


Introduction

Collaboration is often a critical component of scientific research because an individual scholar may not possess all the time, expertise, and knowledge needed to address a complex research problem effectively. Researchers can benefit from collaborating to exchange ideas, share skills, and save time. Meanwhile, recent studies have demonstrated that research collaboration has a positive effect on publishing productivity, and productive scientists tend to be more cooperative (Lee and Bozeman 2005; Xia et al. 2014). Academia is also becoming more and more cooperative. Consequently, scientists are paying increasing attention to understanding the underlying patterns and laws of scientific collaborations (Newman 2001c; Guimera et al. 2005; Wuchty et al. 2007; Ke and Ahn 2014; Petersen 2015; Kong et al. 2016).

Nevertheless, most previous research focuses on macroscopic properties of collaboration networks such as network structure (Newman 2001c, 2004; Badar et al. 2015), network formation (Newman 2001b; Guimera et al. 2005), tie strength distribution (Ke and Ahn 2014), and team science (Milojevic 2014). Despite these studies, little is known about how attributes of the local social network affect scholars’ academic strategies in choosing collaborators. For instance, the questions of how frequently a beginning scholar collaborates with junior or senior scholars in his/her egocentric network, and of which collaborative relationships are stable, remain unaddressed.

Recently, however, researchers have begun analyzing scientific careers from an egocentric perspective (Petersen et al. 2014; Ortega 2014; Borrett et al. 2014; Cavusoglu and Turker 2014; Petersen 2015; King et al. 2016). For example, in order to quantify collaboration duration frequency distributions and tie strength, Petersen (2015) studied the longitudinal aspects of scientific collaboration from an egocentric perspective. That study shows that weak ties dominate collaboration networks and that super ties contribute to above-average productivity. However, the scale of its network is limited, and it overlooks the fact that different scholars begin their research at different points in time. As can be seen from Fig. 1, different scholars begin their academic careers at different times, and their strategic decisions also vary over time. This may result from changes in the academic environment and the development of communication technology. This work takes advantage of the scientific collaboration networks of two big scholarly data sets, i.e., APS and DBLP, including 621,493 scholars and 2,646,941 collaboration records, to study scholars’ academic-age-aware strategies and their dynamics in different historical times.

Fig. 1 Academic careers at different times

A few studies have analyzed scientific collaborations based on scholars’ academic age (Zoega et al. 2012; Badar et al. 2014; Petersen 2015; Sugimoto et al. 2016; Sinatra et al. 2016; Turker and Cavusoglu 2016). In this study, we investigate the academic-age-based collaborative strategies that scholars use to meet their academic needs, such as publishing articles, maintaining collaborations, and acquiring reputation. Scholars at different academic ages may exhibit different academic strategies in relation to collaborations. For example, young scholars are pursuers, whereas senior scholars are attractors, when facing new collaborative opportunities (Petersen 2015). Understanding the underlying laws and patterns of academic-age-aware collaborations can help research policy makers to design better policies to enhance collaboration, as well as to eliminate barriers between scholars of different ages. Moreover, by analyzing the academic strategies of successful scholars, we may predict the future success of a given scholar. Therefore, the answer to an interesting and critical question, namely to what extent scholars of different academic ages collaborate with each other, can be inferred from scientific collaboration networks.

Our study is based on two widely used public scholarly data sets, i.e., APS and DBLP, which contain abundant information on scholars and their publications. Using the APS data sets, we calculate and thoroughly analyze scholars’ annual productivity growth patterns at different academic ages during the recent time period, i.e., 2011–2013, and then group scholars as Beginning, Junior, and Senior scholars. We further examine collaborations among different academic age groups in the same time period. Similarly, we study collaborations during two preceding time periods, namely, 1999–2001 and 1987–1989, and compare them with the recent time period in order to understand the dynamics of academic collaborations over time. Through our comprehensive analysis, we discover that the academic age of scholars in a given time period follows a long-tail distribution. In addition, we find that there are more young scholars in Computer Science than in Physics. The differences in collaborations between these two fields are also discussed.

Data set and methods

Data set

The data sets used in this work are obtained from APS (Sinha et al. 2015) and DBLP (Ley 2009), which are two of the largest scholarly data sets that can be employed to analyze and mine academic social networks. DBLP, which is the most comprehensive open bibliographic data service in computer science, has over 3 million publications with more than 1.6 million authors and thousands of journals and conferences in the field. APS comprises over 450,000 articles dating back to 1893 and over a million authors in the Physics discipline. The DBLP data sets can be downloaded freely (http://dblp.uni-trier.de/xml/) and the APS data sets can be downloaded with permission (http://journals.aps.org/datasets). As can be seen from Fig. 1, scientific collaboration patterns may vary as time goes on. In order to capture the dynamics of collaborations, our study considers three time periods, 1987–1989, 1999–2001, and 2011–2013, which are spaced 12 years apart. An interval of 12 years is long enough to distinguish two generations of scholars. Considering the time period 2011–2013, we first extract all publications and their authors from the APS data sets and discard papers with more than 100 authors, on the grounds that such collaborations reflect only weak social relationships. For each author, we consider both the institution and the co-authors to solve the problem of name disambiguation (Ferreira et al. 2012; Tang et al. 2012). During data processing, the academic ages of authors are calculated from all their past publications. We then construct an undirected, weighted scientific collaboration network, where nodes represent scholars and edges denote collaborative relationships among scholars. The strength of the edge between two scholars is defined as the number of their collaborations during this time period. To remove isolated nodes, we extract the giant component from the network. Similar data preparation is carried out for the other time periods in both APS and DBLP.
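The network construction described above can be illustrated with a minimal sketch. It uses only the Python standard library rather than NetworkX so that it runs without dependencies; the function names, the `MAX_AUTHORS` constant, and the toy `papers` list are hypothetical and are not taken from the authors' code. Name disambiguation is assumed to have been done already.

```python
from collections import defaultdict
from itertools import combinations

MAX_AUTHORS = 100  # papers with more than 100 authors are discarded, as in the text


def build_network(papers):
    """Return {author: {coauthor: weight}} for an undirected, weighted network.

    `papers` is a list of author lists for one time period (hypothetical input format).
    The weight of an edge is the number of papers the two authors share.
    """
    graph = defaultdict(dict)
    for authors in papers:
        unique = sorted(set(authors))
        if len(unique) > MAX_AUTHORS:
            continue
        for a, b in combinations(unique, 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return dict(graph)  # authors without any co-author never enter the graph


def giant_component(graph):
    """Keep only the largest connected component of the network."""
    seen, best = set(), set()
    for start in graph:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            for neighbor in graph[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    stack.append(neighbor)
        seen |= component
        if len(component) > len(best):
            best = component
    return {n: {m: w for m, w in graph[n].items() if m in best} for n in best}


papers = [["A", "B"], ["A", "B", "C"], ["D", "E"], ["F"]]
network = giant_component(build_network(papers))
```

In this toy input, A and B share two papers, so their edge has weight 2, and the smaller component formed by D and E is dropped.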

We formally define academic age (AA) as the time, in years, elapsed since a scholar published his/her first paper. Every scholar has his/her own academic career, and at different career stages a scholar has a different academic age. For example, a beginning scholar has a low academic age, whereas a senior scholar has a high academic age because he/she has been doing research for many years. We show the distribution of different academic ages in Fig. 2. From this figure, we can see that most scholars are of young academic age. The interactive version is available at http://thealphalab.org/AcademicAge/. In general, a scholar who has a higher academic age will be more skilled. Although every scholar has an academic age, the question of how scholars with different academic ages collaborate with each other remains unaddressed.
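As a rough illustration of how academic age could be derived from publication records, the sketch below finds each author's first publication year and subtracts it from a reference year. The input format and function names are hypothetical, and the convention that the age is 0 in the debut year is an assumption; the text does not say whether counting starts at 0 or 1.

```python
def first_publication_years(papers):
    """Map each author to the year of his/her earliest paper.

    `papers` is a list of (year, authors) tuples (hypothetical input format).
    """
    first = {}
    for year, authors in papers:
        for author in authors:
            if author not in first or year < first[author]:
                first[author] = year
    return first


def academic_age(first_year, reference_year):
    # Assumption: whole years since the first paper, 0 in the debut year.
    return reference_year - first_year


papers = [(2001, ["A", "B"]), (1995, ["A"]), (2010, ["B", "C"])]
first = first_publication_years(papers)
ages_2013 = {author: academic_age(year, 2013) for author, year in first.items()}
```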

Method

In order to split scholars into several groups of different academic ages, we first explore the whole APS data set to calculate the average annual productivity, i.e., the number of publications per year, of scholars at each academic age. Specifically, we extract scholars who have published more than 30 articles to calculate the annual productivity. Figure 3 shows that the annual productivity of scholars follows a consistent pattern across academic ages. For scholars with AA < 12, the productivity slowly goes up. If 12 ≤ AA < 24, scholars tend to produce more papers across their academic careers. When 24 ≤ AA, the annual productivity slightly declines with age. We can see from this figure that scholars’ careers are not stable: over the course of an academic career, the productivity of scholars is dynamic. If we want to analyze scientific collaborations over academic ages, such dynamics cannot be ignored. Thus, we split scholars’ academic careers into three stages: early career, rising career, and mature career. According to the three career stages, we split scholars into three groups, namely, Beginning scholars (early career with AA < 12), Junior scholars (rising career with 12 ≤ AA < 24), and Senior scholars (mature career with 24 ≤ AA).
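The grouping rule above is simple enough to state as a small function. This is only a sketch of the thresholds given in the text; the function name and the string labels are illustrative choices.

```python
def career_group(academic_age):
    """Classify a scholar by academic age (AA) using the thresholds 12 and 24."""
    if academic_age < 12:
        return "Beginning"  # early career, AA < 12
    if academic_age < 24:
        return "Junior"     # rising career, 12 <= AA < 24
    return "Senior"         # mature career, 24 <= AA


groups = [career_group(age) for age in (3, 12, 23, 24, 40)]
```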

Since academic age is a kind of human/scholar demographic, it can be better explored through micro-level analysis, i.e., the ego network (Sinatra et al. 2016). In order to study how scholars collaborate with others from a local perspective, we explore the scholars’ ego networks. Conceptually, the ego network of a scholar in an academic social network is defined as a network consisting of a central node (the scholar himself/herself) together with the scholars connected to the central node and all the links among them. Examples of beginning, junior, and senior scholar ego networks are depicted in Fig. 4. Based on scientific ego network properties, we analyze the academic triadic structure in collaboration. For instance, from Fig. 4a, we can infer same-age relationships, i.e., Beginner–Beginner–Beginner (BBB) triads, or cross-age relationships, i.e., Beginner–Junior–Senior (BJS) triads. Similar statistical analysis can also be performed on other academic triads in the junior ego network (Fig. 4b) and the senior ego network (Fig. 4c). Figure 4d depicts an example of an interconnected component of authors and their respective neighborhoods.

Fig. 2 Proportions of different academic ages over time. (Color figure online)

Fig. 3 Scholars’ annual productivity at different academic ages (APS; x-axis: Academic Age, y-axis: Productivity; regions labeled Beginning, Junior, and Senior)

Fig. 4 Examples of scientific ego networks (panels a–d; panel labels include Beginning, Junior, and Senior)
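The ego network definition given in this section (a central scholar, the scholars connected to him/her, and all the links among them) can be sketched in a few lines. The adjacency-set representation and the function name are assumptions made for illustration.

```python
def ego_network(graph, ego):
    """Return the sub-network made of `ego`, its neighbors, and all links among them.

    `graph` maps each scholar to the set of his/her collaborators.
    """
    members = {ego} | set(graph[ego])
    return {n: {m for m in graph[n] if m in members} for n in members}


toy = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}
ego_a = ego_network(toy, "A")
```

Here D is a collaborator of B but not of A, so D and the B–D link are left out of A's ego network.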

We present a comprehensive analysis of the relationship between scholars’ academic age and their collaboration behaviors to observe how scholars of different academic ages collaborate with each other to meet their specific academic needs. Inspired by the idea of exploring ego network structure in previous work (Dong et al. 2014), we consider the following four network properties. These features characterize the local network of a given scholar.

Degree centrality (DC): the number of collaborators of a scholar i in the network (Opsahl et al. 2010; Badar et al. 2016), which is calculated as:

$$DC = N_i \tag{1}$$

where $N_i$ is the number of neighbors of scholar i.

Neighbor connectivity (NC): the average degree of the neighbors of a specific scholar i (Zhao et al. 2013), which is calculated as:

$$NC = \frac{1}{|N_i|} \sum_{v \in N_i} N_v \tag{2}$$

where the sum runs over the neighbors v of scholar i, $|N_i|$ is the number of those neighbors, and $N_v$ is the number of neighbors of scholar v.

Local clustering coefficient (CC): the measure of how closely a scholar’s neighbors relate to each other (Newman 2001a; Leskovec et al. 2008), which is calculated as:

$$CC = \frac{2E_i}{N_i (N_i - 1)} \tag{3}$$

where $E_i$ is the number of edges among the neighbors of scholar i.

Embeddedness (EN): the extent to which scholars are enmeshed in the scientific collaboration network (Granovetter 1985), which can be calculated as:

$$EN = \frac{1}{|N_i|} \sum_{v \in N_i} \frac{|N_i \cap N_v|}{|N_i \cup N_v|} \tag{4}$$

where $N_i$ is the set of neighbors of scholar i and $N_v$ is the set of neighbors of scholar v.

All the features are extracted and calculated with Python. The network is constructed with the NetworkX package (Schult and Swart 2008), a high-productivity Python software package for complex networks. After the construction of the network, these features can be calculated directly from their definitions. All experiments are performed on a 64-bit Windows-based operating system, with a 4-duo, 2.6-GHz Intel Xeon CPU and 128 GB of memory.
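To make the four definitions concrete, the following sketch computes DC, NC, CC, and EN directly from Eqs. (1)–(4) on a toy network. It is not the authors' implementation: it uses plain Python sets instead of NetworkX, the function names are hypothetical, and it assumes the scholar has at least one neighbor (which holds inside the giant component).

```python
from itertools import combinations


def degree_centrality(graph, i):
    """Eq. (1): number of neighbors of scholar i."""
    return len(graph[i])


def neighbor_connectivity(graph, i):
    """Eq. (2): average degree of the neighbors of scholar i."""
    neighbors = graph[i]
    return sum(len(graph[v]) for v in neighbors) / len(neighbors)


def clustering_coefficient(graph, i):
    """Eq. (3): 2 * E_i / (N_i * (N_i - 1)), with E_i the edges among i's neighbors."""
    neighbors = graph[i]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention chosen here for scholars with fewer than two neighbors
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


def embeddedness(graph, i):
    """Eq. (4): mean overlap |N_i & N_v| / |N_i | N_v| over the neighbors v of i."""
    n_i = graph[i]
    return sum(len(n_i & graph[v]) / len(n_i | graph[v]) for v in n_i) / len(n_i)


# Toy network: scholar -> set of collaborators.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}
features_b = (
    degree_centrality(toy, "B"),
    neighbor_connectivity(toy, "B"),
    clustering_coefficient(toy, "B"),
    embeddedness(toy, "B"),
)
```

For scholar B, the three neighbors have degrees 2, 2, and 1, and only one of the three possible links among them (A–C) exists.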


Results

Scholar academic age distribution

We observe that the number of scholars drops sharply as academic age increases (see Fig. 5). For example, in the APS data sets, there are more than 33,000 physics scholars whose academic ages are < 4, which is more than 30% of all scholars. In other words, more than 30% of scholars stop publishing papers after the third academic age. This shows that many scholars have a short academic career. At the academic age of 10, there are merely 2051 scholars, which is 1.8% of all scholars. We can see from this figure that the academic age distribution follows a long-tail distribution, in which more than 80% of scholars are of young age. Meanwhile, even though DBLP has more scholars than APS, we observe that there are more senior scholars in Physics than in Computer Science. We further consider academic ages beyond 29 and find that scholars in Physics outnumber those in Computer Science, which strengthens our observation.
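The kind of counting behind these statements (number of scholars at each academic age, and the share below a given age) can be sketched as follows. The list of ages is made-up toy data, not the APS or DBLP figures, and the function names are hypothetical.

```python
from collections import Counter


def age_distribution(ages):
    """Return {academic age: number of scholars}, sorted by age."""
    return dict(sorted(Counter(ages).items()))


def share_below(ages, threshold):
    """Fraction of scholars whose academic age is strictly below `threshold`."""
    return sum(1 for age in ages if age < threshold) / len(ages)


ages = [1, 1, 2, 3, 3, 3, 10, 25]  # toy data only
distribution = age_distribution(ages)
young_share = share_below(ages, 4)
```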

Network characteristics

We explore the extent to which academic age correlates with the four network structure properties in Fig. 6. Our analysis shows that at the academic age of 35, the DC and NC of both APS and DBLP reach peak values and then decrease slightly (see Fig. 6a, b). At the same time, APS has higher values than DBLP at all academic ages. The DC and NC of DBLP increase slowly before academic age 25.

Furthermore, we find that both CC and EN values decrease smoothly and reach their lowest value around the academic age of 30 (see Fig. 6c, d). These two metrics have also been used to analyze the correlation between human communication behaviors and ages (Dong et al. 2014), where the results show that both CC and EN increase when people get older. By comparison, we discover that collaborating with more scholars makes a scholar’s ego network relatively unstable, whereas beginning and senior scholars have relatively stable connections. Similar to the first two network properties, i.e., DC and NC, APS has higher values of CC and EN than DBLP at all academic ages. In the meantime, the CC and EN of DBLP drop more quickly than those of APS before academic age 25.

Fig. 5 Number of scholars at different academic ages. Shown as the distributions of scholars at different academic ages in APS and DBLP (x-axis: Academic Age; y-axis: Number of Scholars; legend: APS, DBLP)

Meanwhile, we further calculate the assortativity of beginning, junior, and senior scholars to explore the NC feature. In a given network, if the hubs tend to link to each other and the small-degree nodes tend to connect to other small-degree nodes, the network is assortative (Barabasi 2016). The assortativity feature is similar to the NC feature because both explore the relationship between nodes’ degrees and their neighbors’ degrees. We can see from Fig. 7 that these networks are all assortative. The scholars in APS are more assortative than those in DBLP. Meanwhile, as academic age increases, scholars tend to be more assortative.
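One common way to inspect assortativity is the degree correlation function Knn(K): the average neighbor degree, averaged over all nodes of degree K. A curve that rises with K suggests an assortative network. The sketch below assumes this standard definition, which appears to match the axes of Fig. 7; the function name and the toy network are illustrative only.

```python
from collections import defaultdict


def knn_by_degree(graph):
    """Return {K: Knn(K)}: mean neighbor degree averaged over nodes of degree K."""
    per_degree = defaultdict(list)
    for node, neighbors in graph.items():
        if not neighbors:
            continue  # isolated nodes have no neighbor degree
        k = len(neighbors)
        mean_neighbor_degree = sum(len(graph[v]) for v in neighbors) / k
        per_degree[k].append(mean_neighbor_degree)
    return {k: sum(values) / len(values) for k, values in sorted(per_degree.items())}


toy = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}
knn = knn_by_degree(toy)
```

In this tiny example Knn(K) falls as K grows, so the toy network is not assortative; it only demonstrates the calculation.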

Fig. 6 Correlations between academic age and network properties. a The degree centrality, b neighbor connectivity, c local clustering coefficient, and d embeddedness dynamics at each academic age in APS and DBLP

Fig. 7 Assortativity of beginning, junior, and senior scholars. K denotes the degree of target nodes and Knn(K) denotes the degree of their neighbors. a Beginning scholars, b junior scholars, c senior scholars (axes: K and Knn(K); legend: APS, DBLP)

We can see from Figs. 6 and 7 that as academic age increases, the degree centrality goes up accordingly, and the local clustering coefficient of junior scholars is lower than that of beginning and senior scholars. The overall trend is that as academic age increases, both the DC and NC go up slowly, since beginning scholars will meet more junior scholars who might have more collaborators. From a sociological perspective, the above observation can be explained by the nature of academic society: scholars adopt different collaboration strategies to meet their academic needs. Beginning scholars have few but stable connections, which may result from mentor-student relationships. For instance, a Ph.D. candidate may frequently collaborate with his/her mentors as well as lab members. Thus, this might show that early-career scholars, who have lower degree values, are more active in broadening their academic circles, while senior scholars, who have stable embeddedness scores, tend to keep steady relationships. In other words, young scholars have small but stable academic circles. When they extend their academic circles, their co-authors may collaborate less with each other. Then, after a certain age, they will meet few new collaborators and the connectivity among their co-authors will go up slowly.

Triads in collaborations

In order to better understand various collaboration triads, we investigate the composition of scholars’ collaborators at different academic ages. The academic age distribution of the collaborators of beginning, junior, and senior scholars in APS is shown in Fig. 8a. Our analysis finds that the collaborators of beginning, junior, and senior scholars consist mainly of scholars whose academic age is below four. In particular, most collaborators have an academic age of three. Each group’s academic age distribution also shows that (a) Beginning scholars have more collaborators whose academic age falls between 1 and 9 than junior and senior scholars do; (b) Junior scholars have more collaborators whose academic age falls between 10 and 15 than beginning and senior scholars do; and (c) Senior scholars have more collaborators whose academic age is more than 15. Similar trends are also found in DBLP, as shown in Fig. 8b. However, in DBLP, scholars with academic age between 1 and 3 account for a large percentage of the distributions. In the meantime, there are differences compared to APS as follows: (a) Beginning scholars have more collaborators of academic age 1–5 than junior and senior scholars do; (b) Junior scholars have more collaborators of academic age 6–19 than beginning and senior scholars do; and (c) Senior scholars have more collaborators whose academic age is more than 20.

Fig. 8 Collaborators’ academic age distribution of beginning, junior and senior scholars. Shown as components of beginning, junior and senior scholars’ collaborators in a APS and b DBLP (x-axis: Academic Age; y-axis: Proportion). (Color figure online)

In addition, we analyze the percentages of different collaboration triads and depict them using the pie charts in Fig. 9. The most common collaboration relationship in APS is the BBJ triad, which accounts for 19.1% of the total collaborations. Each of the BJS, BBB, BBS, and BJJ triads accounts for more than 10% of all triads. However, in DBLP, the most common collaboration relationship is BBB, with a share of 46.5%, which is almost half of all the triads. The second and third largest collaboration triads in DBLP are BBJ and BBS, respectively. Since there are fewer old scholars in both APS and DBLP, there are fewer SSS and JSS triads. This is probably because Computer Science is a relatively young discipline compared to Physics, and more papers are published by beginning scholars. On the one hand, the Computer Science discipline develops fast, which makes it difficult for older scholars to catch up with rapidly emerging applications, methodologies, and technologies of computing. On the other hand, Physics is relatively more stable and senior scholars are still active.
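Triad shares like those reported above could be obtained by enumerating triads and labeling each one by the career groups of its three members. The sketch below assumes that a triad is a closed triangle of three mutually connected scholars, which the text does not state explicitly; the function name, the single-letter group labels, and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def triad_shares(graph, group):
    """Return {triad label: share}, e.g. {'BBJ': 0.5, 'BJS': 0.5}.

    `graph` maps each scholar to a set of collaborators;
    `group` maps each scholar to 'B', 'J', or 'S'.
    """
    counts = Counter()
    for node, neighbors in graph.items():
        for u, v in combinations(sorted(neighbors), 2):
            # Count each triangle once, from its alphabetically smallest member.
            if node < u and v in graph[u]:
                label = "".join(sorted(group[n] for n in (node, u, v)))
                counts[label] += 1
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


toy = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}
group = {"A": "B", "B": "B", "C": "J", "D": "S"}
shares = triad_shares(toy, group)
```

Sorting the three letters gives the same label regardless of the order in which the members are visited, so B–J–B and J–B–B are both counted as BBJ.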

Based on the analysis above, we can conclude that scientific collaborations based on academic age display a homophily phenomenon, i.e., people tend to interact with those who are similar to them (Lazarsfeld and Merton 1954). This phenomenon has been comprehensively analyzed and verified in online social networks (Lou et al. 2013) and human communication networks (Dong et al. 2014). The academic age homophily is much stronger for scholars aged from 1 to 5. For example, the BBB triad accounts for a large percentage of collaborations in both Computer Science and Physics (see Fig. 9).

Fig. 9 Proportions of different triads. Shown as the ratios of different collaboration triads in a APS and b DBLP. (Color figure online)

Collaboration frequency between AAs

Previous research has shown that incorporating collaboration times can help to improve the accuracy of recommending collaborators (Xia et al. 2014). The heat map in Fig. 10 shows the collaboration frequency distributions among academic ages. The X-axis and Y-axis represent scholars’ academic ages. The spectrum color, which ranges from blue (low) to red (high), shows the collaboration frequency between different ages. It shows that as academic age increases, scholars tend to collaborate with each other more frequently. In APS, we find that scholars with lower academic age, i.e., no more than 5, usually collaborate with others once, while scholars with high academic age, i.e., more than 25, collaborate with others more than twice. Due to the rarity of very old scholars in DBLP, there are some null points between academic ages 33 and 43. Meanwhile, scholars in Computer Science tend to collaborate with each other more frequently from a global perspective. Another interesting phenomenon is a highlighted diagonal line in DBLP, which suggests an academic-age homophily phenomenon: scholars tend to collaborate with others of similar academic age.
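A heat map of this kind could be built from a table of collaboration frequency per pair of academic ages. The sketch below assumes that each cell is the mean edge weight (number of joint papers) over all collaborating pairs with those two ages; the paper does not spell out the exact aggregation, and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def mean_frequency_by_age_pair(edges, age):
    """Return {(age_low, age_high): mean number of collaborations per linked pair}.

    `edges` is an iterable of (u, v, weight); `age` maps each scholar to an academic age.
    """
    totals = defaultdict(lambda: [0, 0])  # age pair -> [weight sum, number of edges]
    for u, v, weight in edges:
        pair = tuple(sorted((age[u], age[v])))
        totals[pair][0] += weight
        totals[pair][1] += 1
    return {pair: weight_sum / n for pair, (weight_sum, n) in totals.items()}


edges = [("A", "B", 1), ("A", "C", 3), ("B", "C", 2), ("D", "C", 1)]
age = {"A": 2, "B": 2, "C": 30, "D": 2}
frequency = mean_frequency_by_age_pair(edges, age)
```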

Collaboration distribution between AAs

The spectrum color extending from blue (low) to red (high) in Fig. 11 represents the proportions of collaborations between scholars of every pair of academic ages. Both Fig. 11a and b show that as academic age increases, the proportion of collaborations drops accordingly. At the same time, there is a relatively highlighted diagonal line in both APS and DBLP, which also demonstrates the existence of the homophily phenomenon. Meanwhile, connections among beginning scholars are more common, which indicates that beginning scholars are the main workforce in academia. However, the spectrum color in DBLP changes more dramatically than that in APS. A possible reason is that most scholars have a low academic age. For example, a professor may supervise several doctoral and master’s students. These students may collaborate more frequently with each other. After graduation, they may go to work where they will seldom write papers. Thus, mentors will probably recruit new students. Under this condition, there are more beginning scholars and few senior scholars in academia. Therefore, we can conclude that a large percentage of collaborations happen among young scholars.

Fig. 10 Collaboration times among ages. Shown as the collaboration times between any two academic ages in a APS and b DBLP. (Color figure online)
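The proportions of collaborations per pair of academic ages discussed in this subsection can be sketched as a normalized table. The code assumes that every joint paper counts once toward its age pair and that the shares are taken over all collaborations in the network; both choices are assumptions, and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def collaboration_share_by_age_pair(edges, age):
    """Return {(age_low, age_high): share of all collaborations}.

    `edges` is an iterable of (u, v, weight); `age` maps each scholar to an academic age.
    """
    totals = defaultdict(int)
    for u, v, weight in edges:
        totals[tuple(sorted((age[u], age[v])))] += weight
    grand_total = sum(totals.values())
    return {pair: weight / grand_total for pair, weight in totals.items()}


edges = [("A", "B", 1), ("A", "C", 3), ("B", "C", 2), ("D", "C", 1)]
age = {"A": 2, "B": 2, "C": 30, "D": 2}
shares_by_pair = collaboration_share_by_age_pair(edges, age)
```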

Fig. 11 Collaboration distribution between ages. Shown as the collaboration distribution between any two academic ages in a APS and b DBLP. (Color figure online)

Fig. 12 Network properties in different time periods. Shown as a the degree centrality, b neighbor connectivity, c local clustering coefficient, and d embeddedness dynamics at each academic age in APS at three time periods. (Color figure online)

Comparisons in different time periods

Figure 12 shows the four network properties at each academic age during the periods 1987–1989, 1999–2001, and 2011–2013. According to Fig. 3, where the academic career of a scholar is divided into three stages based on productivity, the time interval between two stages is 12 years. We therefore space the periods 12 years apart, on the idea that 12 years is long enough to span two generations of scholars.

From Fig. 12a, b, we can observe that DC and NC increase with academic age in all three time periods. In 1987–1989 and 1999–2001, the trends of DC and NC are almost the same, while in 2011–2013 the values of DC and NC are higher than those of the previous two periods. Advances in science and communication platforms may account for the growing number of multi-author publications. This result thus shows that collaborations are becoming more popular, which is consistent with the study of Katz and Martin (1997).
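To make the comparison of DC and NC across academic ages concrete, the sketch below computes both quantities on an undirected co-authorship graph and averages them per academic age. It is an illustration under stated assumptions rather than the procedure used in this study: DC is taken here as the number of collaborators divided by n − 1, NC as the mean degree of a scholar's collaborators, and the function names and input format are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (author, author) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def degree_centrality(adj):
    """Assumed definition: number of collaborators divided by (n - 1)."""
    n = len(adj)
    return {u: (len(nbrs) / (n - 1) if n > 1 else 0.0) for u, nbrs in adj.items()}


def neighbor_connectivity(adj):
    """Assumed definition: average degree of a scholar's collaborators."""
    return {
        u: (sum(len(adj[v]) for v in nbrs) / len(nbrs) if nbrs else 0.0)
        for u, nbrs in adj.items()
    }


def mean_by_age(values, age):
    """Average a per-scholar value over all scholars of the same academic age."""
    buckets = defaultdict(list)
    for u, x in values.items():
        buckets[age[u]].append(x)
    return {a: sum(xs) / len(xs) for a, xs in sorted(buckets.items())}
```

Running `mean_by_age` once per time window (for example on the graphs built from 1987–1989, 1999–2001, and 2011–2013 records) yields curves that can be compared in the way described above.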

From Fig. 12c, it can be seen that the CC of the three time periods is practically the same. Beginning and senior scholars have higher CC, while junior scholars have lower CC. Figure 12d presents the EN of the three time periods. Scholars' EN is highest in 1987–1989 and lowest in 2011–2013. This indicates that, as time goes on, scholars tend to have a less stable academic condition, i.e., they are less well embedded in collaboration networks.
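As a worked illustration of the CC values discussed above, the following sketch computes the standard local clustering coefficient, i.e. the fraction of pairs of a scholar's collaborators who have also collaborated with each other. The adjacency-map input and the function name are assumptions for illustration; embeddedness is not covered here.

```python
from itertools import combinations


def local_clustering(adj):
    """Local clustering coefficient for each node of an undirected graph.

    adj: dict mapping each node to the set of its neighbors.
    CC(u) = 2 * (links among neighbors of u) / (k * (k - 1)), with k = degree.
    Nodes with fewer than two neighbors get 0.0 by convention.
    """
    cc = {}
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            cc[u] = 0.0
            continue
        links = sum(1 for v, w in combinations(nbrs, 2) if w in adj[v])
        cc[u] = 2.0 * links / (k * (k - 1))
    return cc
```

For example, a scholar with three collaborators of whom only one pair has co-authored has CC = 1/3.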

Furthermore, the proportions of different collaboration triads in different time periods are shown in Fig. 13. In 1987–1989 and 1999–2001, BBB triads account for the largest proportion of scientific collaborations, while in 2011–2013 the largest proportion belongs to BBJ triads. The shares of the top three triads, i.e., BBB, BBJ, and BBS, drop sharply in 1987–1989 and 1999–2001, while the shares of the top four triads, i.e., BBB, BBJ, BBS, and BJS, show no obvious change in 2011–2013. Meanwhile, more senior scholars are active in 2011–2013 than in the other two periods.
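The triad proportions can be illustrated with a short sketch that labels every closed triad in a co-authorship graph by the career stages of its three members. This is a sketch under assumptions, not the authors' code: triads are taken to be closed triangles, node identifiers are assumed to be mutually comparable (for example strings), and the stage cut-offs `junior_from` and `senior_from` are hypothetical parameters that should be set to the thresholds defined for the B, J, and S groups.

```python
from collections import Counter
from itertools import combinations


def stage(age, junior_from, senior_from):
    """Map an academic age to B (beginning), J (junior), or S (senior)."""
    if age >= senior_from:
        return "S"
    if age >= junior_from:
        return "J"
    return "B"


def triad_proportions(adj, age, junior_from, senior_from):
    """Share of each triad type (e.g. 'BBJ') among all closed triads.

    adj: dict mapping each scholar to the set of collaborators.
    age: dict mapping each scholar to an academic age.
    """
    counts = Counter()
    for u in adj:
        # sorted() gives v < w; requiring u < v counts each triangle once.
        for v, w in combinations(sorted(adj[u]), 2):
            if u < v and w in adj[v]:
                label = "".join(
                    sorted(stage(age[x], junior_from, senior_from) for x in (u, v, w))
                )
                counts[label] += 1
    total = sum(counts.values())
    return {label: c / total for label, c in counts.items()} if total else {}
```

Because the stage letters are sorted within each label, the keys match the triad names used in Fig. 13 (BBB, BBJ, BBS, BJS, and so on).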

[Figure: proportion (vertical axis, 0.00–0.35) of each triad type BBB, BBJ, BBS, BJS, BJJ, BSS, JJS, JSS, JJJ, and SSS (horizontal axis) for 2011–2013, 1999–2001, and 1987–1989.]

Fig. 13 Proportions of different triads in different time periods. Shown as the ratios of different collaboration triads of APS in three time periods. (Color figure online)


Discussion

In this work, we study the collaborative strategies among scholars of different academic ages during three distinct time periods. We examine academic-age-aware collaborations from a scholar's local perspective based on his/her ego network. We discover that the academic age distribution follows a long-tail distribution. We explore the differences in collaboration patterns between the Computer Science and Physics disciplines and investigate the collaboration patterns in different time periods. We further uncover an obvious homophily phenomenon in scientific collaboration with respect to academic age.

We believe that academic age, which is a basic demographic characteristic of scholars, should be analyzed extensively. Detecting and investigating academic-age-aware collaboration behaviors can shed new light on the mechanisms of scientific collaboration. Policy makers may establish new policies to promote scientific collaborations based on academic-age-aware collaboration behaviors. Meanwhile, understanding the collaboration mechanisms across academic ages may benefit many practical issues such as team structure optimization.

Despite the promising results of this work, there is still much room for future work. First, academic-age-aware behaviors can be used to explore team science, which might require additional analysis of more social theories and collaboration patterns. Second, studying the impact of academic-age-aware behaviors on collaborator recommendation and on scholars' future success is a worthwhile direction for future research.

Acknowledgements Funding was provided by the Graduate Education Reform Fund of DUT (Grant No. JG2016022).

References

Badar, K., Frantz, T. L., & Jabeen, M. (2016). Research performance and degree centrality in co-authorship networks: The moderating role of homophily. Aslib Journal of Information Management, 68(6), 756–771.

Badar, K., Hite, J. M., & Ashraf, N. (2015). Knowledge network centrality, formal rank and research performance: Evidence for curvilinear and interaction effects. Scientometrics, 105(3), 1553–1576.

Badar, K., Hite, M. J., & Badir, F. Y. (2014). The moderating roles of academic age and institutional sector on the relationship between co-authorship network centrality and academic research performance. Aslib Journal of Information Management, 66(1), 38–53.

Barabasi, A.-L. (2016). Network science. Cambridge: Cambridge University Press.

Borrett, S. R., Moody, J., & Edelmann, A. (2014). The rise of network ecology: Maps of the topic diversity and scientific collaboration. Ecological Modelling, 293, 111–127.

Cavusoglu, A., & Turker, I. (2014). Patterns of collaboration in four scientific disciplines of the Turkish collaboration network. Physica A: Statistical Mechanics and its Applications, 413, 220–229.

Dong, Y., Yang, Y., Tang, J., Yang, Y., & Chawla, N. V. (2014). Inferring user demographics and social strategies in mobile social networks. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 15–24). ACM.

Ferreira, A. A., Goncalves, M. A., & Laender, A. H. (2012). A brief survey of automatic methods for author name disambiguation. ACM SIGMOD Record, 41(2), 15–26.

Granovetter, M. (1985). Economic action and social structure: The problem of embeddedness. American Journal of Sociology, 91(3), 481–510.

Guimera, R., Uzzi, B., Spiro, J., & Amaral, L. A. N. (2005). Team assembly mechanisms determine collaboration network structure and team performance. Science, 308(5722), 697–702.

Katz, J. S., & Martin, B. R. (1997). What is research collaboration? Research Policy, 26(1), 1–18.

Ke, Q., & Ahn, Y.-Y. (2014). Tie strength distribution in scientific collaboration networks. Physical Review E, 90(3), 032804.

King, M. M., Bergstrom, C. T., Correll, S. J., Jacquet, J., & West, J. D. (2016). Men set their own cites high: Gender and self-citation across fields and over time. arXiv preprint arXiv:1607.00376.

Kong, X., Jiang, H., Yang, Z., Xu, Z., Xia, F., & Tolba, A. (2016). Exploiting publication contents and collaboration networks for collaborator recommendation. PLoS ONE, 11(2), e0148492.

Lazarsfeld, P. F., Merton, R. K., et al. (1954). Friendship as a social process: A substantive and methodological analysis. Freedom and Control in Modern Society, 18(1), 18–66.

Lee, S., & Bozeman, B. (2005). The impact of research collaboration on scientific productivity. Social Studies of Science, 35(5), 673–702.

Leskovec, J., Backstrom, L., Kumar, R., & Tomkins, A. (2008). Microscopic evolution of social networks. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 462–470). ACM.

Ley, M. (2009). DBLP: Some lessons learned. Proceedings of the VLDB Endowment, 2(2), 1493–1500.

Lou, T., Tang, J., Hopcroft, J., Fang, Z., & Ding, X. (2013). Learning to predict reciprocity and triadic closure in social networks. TKDD, 7(2), 5.

Milojevic, S. (2014). Principles of scientific research team formation and evolution. Proceedings of the National Academy of Sciences, 111(11), 3984–3989.

Newman, M. E. (2001a). Clustering and preferential attachment in growing networks. Physical Review E, 64(2), 025102.

Newman, M. E. (2001b). Scientific collaboration networks. I. Network construction and fundamental results. Physical Review E, 64(1), 016131.

Newman, M. E. (2001c). The structure of scientific collaboration networks. Proceedings of the National Academy of Sciences, 98(2), 404–409.

Newman, M. E. (2004). Coauthorship networks and patterns of scientific collaboration. Proceedings of the National Academy of Sciences, 101(suppl 1), 5200–5205.

Opsahl, T., Agneessens, F., & Skvoretz, J. (2010). Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks, 32(3), 245–251.

Ortega, J. L. (2014). Influence of co-authorship networks in the research impact: Ego network analyses from Microsoft Academic Search. Journal of Informetrics, 8(3), 728–737.

Petersen, A. M. (2015). Quantifying the impact of weak, strong, and super ties in scientific careers. Proceedings of the National Academy of Sciences, 112(34), E4671–E4680.

Petersen, A. M., Fortunato, S., Pan, R. K., Kaski, K., Penner, O., Rungi, A., et al. (2014). Reputation and impact in academic careers. Proceedings of the National Academy of Sciences, 111(43), 15316–15321.

Schult, D. A., & Swart, P. (2008). Exploring network structure, dynamics, and function using NetworkX. In Proceedings of the 7th Python in science conferences (SciPy 2008) (Vol. 2008, pp. 11–16).

Sinatra, R., Wang, D., Deville, P., Song, C., & Barabasi, A.-L. (2016). Quantifying the evolution of individual scientific impact. Science, 354(6312), aaf5239.

Sinha, A., Shen, Z., Song, Y., Ma, H., Eide, D., Hsu, B.-J. P., & Wang, K. (2015). An overview of Microsoft Academic Service (MAS) and applications. In Proceedings of the 24th international conference on World Wide Web (pp. 243–246). ACM.

Sugimoto, C. R., Sugimoto, T. J., Tsou, A., Milojevic, S., & Lariviere, V. (2016). Age stratification and cohort effects in scholarly communication: A study of social sciences. Scientometrics, 109(2), 997–1016. doi:10.1007/s11192-016-2087-y.

Tang, J., Fong, A. C., Wang, B., & Zhang, J. (2012). A unified probabilistic framework for name disambiguation in digital library. IEEE Transactions on Knowledge and Data Engineering, 24(6), 975–987.

Turker, I., & Cavusoglu, A. (2016). Detailing the co-authorship networks in degree coupling, edge weight and academic age perspective. Chaos, Solitons and Fractals, 91, 386–392.

Wuchty, S., Jones, B. F., & Uzzi, B. (2007). The increasing dominance of teams in production of knowledge. Science, 316(5827), 1036–1039.

Xia, F., Chen, Z., Wang, W., Li, J., & Yang, L. T. (2014). MVCWalker: Random walk-based most valuable collaborators recommendation exploiting academic factors. IEEE Transactions on Emerging Topics in Computing, 2(3), 364–375.

Zhao, Y., Wang, G., Yu, P. S., Liu, S., & Zhang, S. (2013). Inferring social roles and statuses in social networks. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 695–703). ACM.

Zoega, H., Valdimarsdottir, U. A., & Hernandez-Diaz, S. (2012). Age, academic performance, and stimulant prescribing for ADHD: A nationwide cohort study. Pediatrics, 130(6), 1012–1018.
