

A Decade of Database Research Publications

Sherif Sakr
National ICT Australia
University of New South Wales, Australia
[email protected]

Mohammad Alomari
School of Information Technologies
University of Sydney, Australia
[email protected]

ABSTRACT
We analyze the database research publications of four major core database technology conferences (SIGMOD, VLDB, ICDE, EDBT), two main theoretical database conferences (PODS, ICDT) and three database journals (TODS, VLDB Journal, TKDE) over a period of 10 years (2001 - 2010). Our analysis considers only regular papers; we do not include short papers, demo papers, posters, tutorials or panels in our statistics. We rank research scholars according to their number of publications in each conference/journal separately and in combination. We also report on the growth in the number of research publications and in the size of the research community over the last decade.

1. INTRODUCTION
Database management technology has played a vital role in the advancement of the information technology field. Database researchers are among the key players in, and main contributors to, the growth of database systems. They play a foundational role in creating the technological infrastructure from which database advancements evolve. The impact of research scholars in the community is often measured by their number of publications in top-tier research venues and by the number of citations they receive, i.e. how frequently their publications are referenced by other publications (e.g. H-index [?], g-index [?]). In principle, there is a direct relationship between the tier rank of a research venue and its number of citations, which is commonly captured by the impact factor [?]. A research scholar who succeeds in publishing results in a top-tier venue increases the chances that the work will be widely received by peers in the community and, consequently, more frequently cited by them.
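For readers unfamiliar with the two citation indices mentioned above, the following is a minimal sketch of their standard definitions. It is an illustration only and not part of our analysis; the function names and the sample citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers have at least g*g citations in total.

    This variant caps g at the number of papers, which is an assumption;
    some formulations allow g to exceed it.
    """
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for one scholar's five papers.
sample = [10, 8, 5, 4, 3]
print(h_index(sample))  # 4
print(g_index(sample))  # 5
```

Both indices depend entirely on citation counts, which is why the data quality issues discussed next affect them directly.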

In general, achieving an accurate, fair and insightful citation-based analysis is a very challenging task because of the difficulty of parsing and extracting citation metadata from research articles. Recently, some online services have been introduced to capture the citation information of research publications (e.g. MS Libra¹, Google Scholar²). However, the information provided by these services suffers from anomalies such as incompleteness and duplication. Therefore, preparing high-quality citation information for a pool of research publications requires an extensive amount of manual labor. Moreover, citation-based analysis methods tend to consider only the explicit citation relationships indicated in the reference sections of the articles. In practice, it is impossible for the authors of any article (including this one) to cite all the publications related to their work; they are normally able to cite only a fraction of them. Therefore, the final selection of the papers to be referenced usually depends on many scientific and non-scientific factors. For example, it has been shown that citations tend to have problems such as biased citation, self-citation, and positive vs. negative citation [?, ?]. One common situation is that article introductions usually cite related survey papers. Therefore, survey papers usually have citation counts many times higher than those of any original work on the corresponding topic (e.g. according to Google Scholar, at the time of writing this paper, of two such surveys, [?] has 883 citations and [?] has 2169 citations). Some studies have also shown that different citation choices correspond to different citation impact [?].

Complementary to previous work that mainly ranked research scholars by their citation counts [?], in this paper we focus on ranking research scholars by the count of their research publications in top-tier venues. We selected a set of top-tier database research venues that are generally considered the most representative, influential and prestigious in the database community. In particular, we analyzed the database research publications of four major core database technology conferences (SIGMOD, VLDB, ICDE, EDBT), two main theoretical database conferences (PODS, ICDT) and three database journals (TODS, VLDB Journal, IEEE TKDE) over a 10-year period (2001 - 2010). In general, we believe that research fields are better represented by their own venues than by multi-disciplinary venues. Therefore, we did not include some important conferences (e.g. CIKM, WWW) and journals (e.g. Information Systems) in the scope of this study.

In principle, some could argue that the number of publications may have become a less insightful or less significant metric due to the explosion in the number of conferences and journals in recent years [?]. To address this argument, we considered only top-tier venues that are well known for their very low acceptance rates. These prestigious venues conduct highly selective review processes that mainly aim at ensuring that they turn out high-quality papers. Hence, these papers are usually expected to attract considerable attention (and citations) from other researchers in the community [?]. In fact, the distribution of our selected venues (6 conferences and 3 journals) is compatible with the fact that database researchers, and computer scientists in general, regard prestigious conferences as their favorite outlets for presenting original research work, in contrast to many other scientific disciplines where journal papers are routinely considered superior to conference papers [?, ?]. For example, it has been shown that the two top database conferences (SIGMOD and VLDB) receive many more citations per paper than the two top database journals (TODS and VLDB J.) [?]. In practice, the general culture in the computer science community is that journal papers are used to present deeper versions of papers that have already been presented at conferences. One of the main reasons behind this is that the review process of journal papers is usually very long. The turnaround time (the interval between the submission date of a manuscript and the date of the editorial decision) for conferences is often less than a third of that of journals [?]. Since computer science research tends to be fast paced, conferences provide a great chance for timestamping the latest research findings earlier, which allows the knowledge to be publicly shared more rapidly.

¹ http://academic.research.microsoft.com/
² http://scholar.google.com

arXiv:1102.1064v1 [cs.DL] 5 Feb 2011

In general, we are witnessing continuous growth in the database field. This is mainly due to the continuous introduction of new application domains (e.g. web applications, mobile applications, cloud computing, sensor networks) with varying features and requirements for their data management aspects. In practice, data has become mobile, flexible, mirrored in a variety of logical and physical forms, evolving, concurrently modified and replicated, dynamically generated and later reintegrated in very large repositories for further analysis and processing [?]. Therefore, many more researchers are entering the field to tackle these challenges, and hence more research papers are being published. In this paper, we also study the growth in the size of the contributing research community and in the number of research publications over the last decade.

The input data of this study has been extracted from the XML records of the well-known DBLP computer science bibliography³. Our analysis considers only regular papers; we do not include short papers, demo papers, posters, tutorials or panels in our statistics. We made the detailed results of our study accessible on the web⁴.

³ http://dblp.uni-trier.de/xml/
⁴ http://www.cse.unsw.edu.au/~ssakr/DBStatistics/index.html

2. STUDY RESULTS
2.1 Top Publishers of Database Research Venues
As we previously stated, in this study we focus on measuring the number of publications in top-tier publication venues as one of the main indicators of the impact of a research scholar in the community and of the quality of their research output. In this paper, we present the most important results of our study. For the full detailed results, we refer the reader to the web page of this study.
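As a rough illustration of the counting step, the sketch below tallies publications per author and venue from a handful of records in the general shape of DBLP XML (`inproceedings` entries with a `booktitle`, `article` entries with a `journal`). It is a simplified sketch under stated assumptions, not the exact procedure used for this study: the sample records, author names and keys are made up, and the filter that separates regular papers from short papers, demos and posters is omitted because it is not specified here.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical records shaped like DBLP XML entries; all values are made up.
SAMPLE = """
<dblp>
  <inproceedings key="conf/vldb/Example01">
    <author>Author A</author>
    <author>Author B</author>
    <title>Made-up Paper One.</title>
    <year>2001</year>
    <booktitle>VLDB</booktitle>
  </inproceedings>
  <inproceedings key="conf/vldb/Example05">
    <author>Author A</author>
    <title>Made-up Paper Two.</title>
    <year>2005</year>
    <booktitle>VLDB</booktitle>
  </inproceedings>
  <article key="journals/tods/Example99">
    <author>Author A</author>
    <title>Made-up Paper Three.</title>
    <year>1999</year>
    <journal>ACM Trans. Database Syst.</journal>
  </article>
  <article key="journals/tods/Example07">
    <author>Author B</author>
    <title>Made-up Paper Four.</title>
    <year>2007</year>
    <journal>ACM Trans. Database Syst.</journal>
  </article>
</dblp>
""".strip()


def count_publications(xml_text, first_year=2001, last_year=2010):
    """Return {venue: Counter(author -> papers)} for records in the year range."""
    root = ET.fromstring(xml_text)
    per_venue = defaultdict(Counter)
    for record in root:
        if record.tag not in ("inproceedings", "article"):
            continue
        year = int(record.findtext("year", default="0"))
        if not first_year <= year <= last_year:
            continue  # outside the studied decade
        venue = record.findtext("booktitle") or record.findtext("journal")
        for author in record.findall("author"):
            per_venue[venue][author.text] += 1
    return per_venue


counts = count_publications(SAMPLE)
top_vldb = counts["VLDB"].most_common(1)  # highest-count author for one venue
```

The full DBLP dump is far larger than this sample, so a streaming parser would likely be preferable to loading the whole file at once.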

Figures 1, 2 and 3 illustrate the top publishers of the database research venues during the period between 2001 and 2010. Figure 1 represents the top publishers of the core database technology conferences: VLDB (Figure 1(a)), SIGMOD (Figure 1(b)), ICDE (Figure 1(c)) and EDBT (Figure 1(d)). Figure 2 represents the top publishers of the theoretical database conferences: PODS (Figure 2(a)) and ICDT (Figure 2(b)). Figure 3 represents the top publishers of the main database journals: VLDB Journal (Figure 3(a)), TODS (Figure 3(b)) and TKDE (Figure 3(c)). A research scholar in these figures may be marked with one or both of the following two symbols:

• The (+) symbol indicates that the research scholar also appears on the corresponding top publishers list of the same research venue for the former decade (1991 - 2000).

• The (*) symbol indicates that the research scholar appears on the all-time top publishers list of the same research venue, covering all of its editions since its origin.

For example, in Figure 1(a), Divesh Srivastava and H. V. Jagadish are both marked as appearing on the top publishers list of the VLDB conference since its origin (1975 - 2010). However, only H. V. Jagadish is marked as appearing on the top publishers list of the VLDB conference for the former decade. Figure 4 illustrates aggregate lists of the top publishers for database research venues according to their focus: core database technology conferences (Figure 4(a)), theoretical database conferences (Figure 4(b)) and database journals (Figure 4(c)). Several remarks can be made about the reported results for these database research venues. Some key remarks are as follows:

• There are 42 distinct research scholars (72 appearances, counting repeats) in the top publishers lists of the four core database technology conferences. There are 34 distinct research scholars (41 appearances) in the top publishers lists of the three main database journals. In combination, there are 63 distinct research scholars across the seven venues. These results show a clear overlap between the lists of these top database research venues.

• Three research scholars appear on the top publishers lists of all core database technology conferences, namely Philip S. Yu, Nick Koudas and Yufei Tao. In addition, Philip S. Yu appears on the top publishers lists of the VLDB Journal and TKDE. Yufei Tao appears on the lists of TODS and TKDE, while Nick Koudas appears only on the list of TODS.

• Six research scholars appear on the top publishers lists of three (out of four) core database technology conferences, namely Divesh Srivastava, Beng Chin Ooi, Surajit Chaudhuri, Jiawei Han, Jeffrey Xu Yu and H. V. Jagadish. In addition, Beng Chin Ooi appears on the lists of the VLDB Journal and TKDE. Jiawei Han appears on the top list of TKDE. Jeffrey Xu Yu appears on the top list of the VLDB Journal. Surajit Chaudhuri and H. V. Jagadish appear on the top list of TODS.


[Bar charts omitted. Panels: (a) VLDB, (b) SIGMOD, (c) ICDE, (d) EDBT. Each panel plots Number of Papers (y-axis) for the top publishers of that venue (x-axis), with names prefixed by * and/or + where applicable.]

Figure 1: Top Publishers in Major Core Database Technology Conferences

[Bar charts omitted. Panels: (a) PODS, (b) ICDT. Each panel plots Number of Papers (y-axis) for the top publishers of that venue (x-axis), with names prefixed by * and/or + where applicable.]

Figure 2: Top Publishers in Major Theoretical Database Conferences


[Bar charts omitted. Panels: (a) VLDB J., (b) TODS, (c) TKDE. Each panel plots Number of Papers (y-axis) for the top publishers of that venue (x-axis), with names prefixed by * and/or + where applicable.]

Figure 3: Top Publishers in Major Database Technology Journals

[Bar charts omitted. Panels: (a) Core DB: VLDB + SIGMOD + ICDE + EDBT, (b) Theoretical DB: PODS + ICDT, (c) DB Journals: VLDB J. + TODS + TKDE. Each panel plots Number of Papers (y-axis) for the top publishers in that group of venues (x-axis).]

Figure 4: Aggregate Lists of Top Publishers for Database Research Venues


[Plots omitted. Panels: (a) Conferences (series: ICDE, VLDB, SIGMOD, EDBT, PODS, ICDT), (b) Journals (series: VLDB J., TODS, TKDE). x-axis: Year (2000 - 2010); y-axis: Number of Publications.]

Figure 5: Growth in Number of Publications

[Plots omitted. Panels: (a) Conferences (series as extracted: ICDE, SIGMOD, EDBT, PODS, ICDT), (b) Journals (series: VLDB J., TODS, TKDE). x-axis: Year (2000 - 2010); y-axis: Number of Authors.]

Figure 6: Growth in Number of Authors


• Eight research scholars appear on the top publishers lists of two core database technology conferences, namely Kian-Lee Tan, Anthony K. H. Tung, Haixun Wang, Dimitris Papadias, Jeffrey F. Naughton, Raghu Ramakrishnan, Divyakant Agrawal and Samuel Madden. In addition, Dimitris Papadias appears on the top publishers lists of TODS and TKDE.

• There are 32 distinct research scholars in the top publishers lists of the two theoretical database conferences (PODS and ICDT). Seven research scholars appear on the lists of both conferences, namely Leonid Libkin, Marcelo Arenas, Phokion G. Kolaitis, Yehoshua Sagiv, Benny Kimelfeld, Christoph Koch and Ronald Fagin.

• Seven research scholars appear on the top publishers list of at least one of the theoretical database conferences and also on the top publishers list of at least one core database technology conference or main database journal, namely Victor Vianu (ICDT, TODS), Phokion G. Kolaitis (PODS / ICDT, TODS), Ronald Fagin (PODS / ICDT, TODS), Johannes Gehrke (PODS, SIGMOD), Wang Chiew Tan (PODS, TODS), Dan Suciu (PODS, VLDB) and Wenfei Fan (PODS, VLDB).

• Ming-Syan Chen has the highest total number of publications in the major database journals in one year. In 2008, he published 9 papers (5 papers in TKDE and 4 papers in the VLDB Journal).

• Philip S. Yu has the highest total number of publications in the major database conferences in one year. In 2009, he published 13 papers (6 papers in VLDB, 5 papers in ICDE and 2 papers in SIGMOD).

• Divesh Srivastava is the top publisher in the aggregate list of all core database technology conferences (Figure 4(a)). He published 67 papers in total, an average of about 7 papers per year. On the other hand, he published only 5 papers in the main database journals. Therefore, he does not appear in the aggregate list of the main database journals (Figure 4(c)). Ten research scholars appear in both of the aggregate lists of top publishers, for core database technology conferences and for database journals, namely Philip S. Yu (with a total of 88 papers), Nick Koudas (71 papers), Jiawei Han (71 papers), Surajit Chaudhuri (69 papers), Yufei Tao (69 papers), H. V. Jagadish (62 papers), Dimitris Papadias (60 papers), Jeffrey Xu Yu (53 papers), Minos N. Garofalakis (45 papers) and Xuemin Lin (42 papers).

• Yannis Papakonstantinou and Dan Suciu had at least one paper in each of the nine major database venues studied in the last decade.

• Table 1 shows the most important co-authorship relations between research scholars in the top lists of the database research venues. For example, Yufei Tao and Dimitris Papadias have co-authored 34 regular papers in the different database research venues. The degree column (Deg.) indicates the number of research scholars participating in the relationship.
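Two of the tallies used in the remarks above can be sketched as follows: the distinct versus total (repeat-counting) number of scholars across several top lists, and the co-authorship counts of a given degree. This is a minimal sketch with hypothetical names and data. It assumes that a paper counts toward a group whenever all members of the group are among its authors, regardless of additional co-authors, which may differ from the exact rule behind Table 1.

```python
from collections import Counter
from itertools import combinations


def list_overlap(top_lists):
    """Return (distinct scholars, total appearances, appearances per scholar)."""
    appearances = Counter(
        name for names in top_lists.values() for name in set(names)
    )
    return len(appearances), sum(appearances.values()), appearances


def coauthorship_counts(papers, degree=2):
    """Count how many papers each group of `degree` authors wrote together."""
    groups = Counter()
    for authors in papers:
        for group in combinations(sorted(set(authors)), degree):
            groups[group] += 1
    return groups


# Hypothetical top lists and author lists, made up for illustration only.
top_lists = {
    "Venue X": ["Author A", "Author B", "Author C"],
    "Venue Y": ["Author A", "Author C", "Author D"],
    "Venue Z": ["Author A", "Author E"],
}
papers = [
    ["Author A", "Author B"],
    ["Author B", "Author A", "Author C"],
    ["Author A", "Author C"],
]

distinct, total, appearances = list_overlap(top_lists)
on_all_lists = sorted(n for n, k in appearances.items() if k == len(top_lists))
pairs = coauthorship_counts(papers, degree=2)
triples = coauthorship_counts(papers, degree=3)
```

In this toy input there are 5 distinct scholars and 8 appearances, so the gap between the two numbers is a direct measure of how much the lists overlap.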

2.2 The Growth in Number of Publications and Database Community Size

The range of topics in the database field is continuously growing. Therefore, more researchers are entering the research community and more research papers are being published [?]. In our study, we determined the number of regular publications for all of the considered publication venues over the ten-year period 2001 - 2010. Moreover, we determined the number of unique authors of the publications of each venue as a measure of the size of its contributing community. Figure 5 presents an overview of the growth in the number of publications in the database research venues, while Figure 6 presents an overview of the growth in the number of unique authors (participating community size). Taken together, the two figures show that the number of research publications and the number of unique authors in the core database technology conferences and database journals have, on average, nearly doubled. In contrast, for the theoretical database conferences (PODS and ICDT), there was no clear increase in either the number of publications or the number of authors. They maintained an average of around 30 papers and 75 authors per conference over the whole decade.

Deg.  Authors                                     # Pub.
2     Yufei Tao and Dimitris Papadias             34
2     Divesh Srivastava and Nick Koudas           33
2     Divyakant Agrawal and Amr El Abbadi         30
2     Vivek R. Narasayya and Surajit Chaudhuri    22
2     Beng Chin Ooi and Anthony K. H. Tung        16
2     Haixun Wang and Philip S. Yu                16
2     Xuemin Lin and Wei Wang                     16
2     Xuemin Lin and Jeffrey Xu Yu                14
3     B. Gedik, P. S. Yu and K. Wu                9
3     D. Agrawal, A. El Abbadi and A. Metwally    7

Table 1: Top Co-authorship Relationships
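The per-year measurements behind Figures 5 and 6 (number of publications and number of unique authors per venue) can be sketched as below. This is an illustrative sketch only: the record layout, function names and sample data are hypothetical, and the growth ratio simply compares the first and last year rather than reproducing any averaging used in the study.

```python
from collections import defaultdict


def yearly_stats(records):
    """records: iterable of (venue, year, authors).

    Returns {(venue, year): (number of papers, number of unique authors)}.
    """
    papers = defaultdict(int)
    authors = defaultdict(set)
    for venue, year, names in records:
        papers[(venue, year)] += 1
        authors[(venue, year)].update(names)  # a set keeps each author once
    return {key: (papers[key], len(authors[key])) for key in papers}


def growth_ratio(stats, venue, first_year, last_year):
    """Ratio of last-year to first-year counts, for papers and for authors."""
    first_papers, first_authors = stats[(venue, first_year)]
    last_papers, last_authors = stats[(venue, last_year)]
    return last_papers / first_papers, last_authors / first_authors


# Hypothetical records, made up for illustration only.
records = [
    ("Venue X", 2001, ["A", "B"]),
    ("Venue X", 2001, ["B", "C"]),
    ("Venue X", 2010, ["A", "D"]),
    ("Venue X", 2010, ["E"]),
    ("Venue X", 2010, ["D", "F"]),
    ("Venue X", 2010, ["G", "H"]),
]

stats = yearly_stats(records)
ratios = growth_ratio(stats, "Venue X", 2001, 2010)
```

Here the toy venue goes from 2 papers and 3 unique authors to 4 papers and 6 unique authors, so both ratios are 2.0, the kind of doubling reported for the core conferences and journals.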

In principle, the number of regular research publications in the core database technology conferences cannot continue growing in proportion to the size of the community. Therefore, most of the conferences have introduced other forms of publication, such as posters, short papers and demo papers, in order to give a wider part of the community a chance to present its work and to keep researchers focused on participating in a small set of top conferences, as there are always limits on the number of conferences that researchers can attend. For example, the 2002 edition of the ICDE conference first introduced the acceptance of demo papers, the 2003 edition introduced the acceptance of poster papers and the 2009 edition introduced the acceptance of 4-page short papers. We believe that having more journal papers could be a good way to absorb this continuous increase in research publications without the need to increase the number of conferences or the number of accepted papers in the current conferences.

One of the main reasons behind the increase in the number of publications in the database community is the continuous introduction of new research challenges that are relevant to the scope of the community. For example, XML emerged as a hot research topic for the database research community early in the last decade. Moro et al. [?] referenced a list of more than 100 publications in a survey paper that provides an overview of some of the work that has been done on different aspects of XML data management. Recently, the topics of large-scale data management on cloud computing and parallel data processing (e.g. MapReduce) have been introduced, and they attract a lot of interest from the database research community [?]. As a consequence, a new series of research conferences, the ACM Symposium on Cloud Computing, was started in 2010 [?]. This series is co-sponsored by the ACM Special Interest Groups on Management of Data (ACM SIGMOD) and on Operating Systems (ACM SIGOPS). The conference will be held in conjunction with the ACM SIGMOD and ACM SOSP conferences in alternate years.

3. CONCLUSIONS
Research is a competitive endeavor. Research scholars usually have multiple goals to achieve, and it is therefore reasonable that their impact be judged by multiple criteria. We believe that ranking research scholars by the count of their publications in top-tier research venues can be an insightful indicator in a comprehensive assessment process. Other important factors, such as invitations to program committees of prestigious conferences, membership on editorial boards of high-quality journals, grant funding and awards, can also be good indicators for evaluating the impact of research scholars.

In this paper, we presented a detailed study of the publications of 6 major database conferences and 3 major database journals in the period between 2001 and 2010. The results of our study reveal that the number of research publications per year and the community size have nearly doubled over the last decade. The results also show a considerable overlap between the top publishers lists of the core database technology conferences and the database journals. The results are also compatible with the fact that researchers in the database community tend to prefer publishing their work in prestigious conferences rather than in major database journals. The average publication rate of top publishers in conference venues greatly exceeds their average publication rate in the major database journals. In principle, we believe that conference publications will remain an attractive way to gain quick publicity for new research findings. However, the number of conferences or the number of accepted publications per conference cannot continue increasing, as this will gradually limit the value of these venues. Therefore, we believe that journal papers will remain the best way to document and archive significant pieces of research that cannot fit within the 12-page limit of conferences. The community should continue pushing towards a culture that values journal papers more highly than conference papers [?]. One of the valuable efforts in this direction is the introduction of the Proceedings of the VLDB Endowment (PVLDB)⁵, which aims at providing a journal-like experience to authors of VLDB submissions.

⁵ http://www.vldb.org/pvldb/