Lotkaian Informetrics and applications to social networks L. Egghe Chief Librarian Hasselt University Professor Antwerp University Editor-in-Chief “Journal of Informetrics” [email protected]


Lotkaian Informetrics and applications to social networks
L. Egghe

Chief Librarian, Hasselt University
Professor, Antwerp University
Editor-in-Chief, “Journal of Informetrics”

[email protected]


1-dimensional informetrics

# authors in a field
# journals in a field
# articles in a field
# references (or citations) in a field
# borrowings in a library
# websites, hosts, …
# web citations to a paper
# in- (or out-) links to/from a website
# downloads of an article


Growth

Exponential growth

All “new” fields grow exponentially

Otherwise there is S-shaped growth.
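
The two growth regimes can be contrasted numerically. This is a minimal sketch, assuming a logistic curve as the S-shaped model (the slides do not say which S-shaped function is meant) and purely illustrative parameter values:

```python
def exponential_growth(t, c=1.0, a=1.5):
    """c * a**t: unbounded growth with growth rate a > 1 (illustrative values)."""
    return c * a ** t


def logistic_growth(t, ceiling=100.0, c=1.0, a=1.5):
    """Logistic curve that starts at c for t = 0 and levels off at `ceiling`."""
    return ceiling / (1.0 + (ceiling / c - 1.0) * a ** (-t))


# Early on both curves look alike; later the logistic curve saturates.
for t in range(0, 25, 4):
    print(t, round(exponential_growth(t), 1), round(logistic_growth(t), 1))
```

For small t the two columns are close, which is why a young field can look exponential even if its growth is eventually S-shaped.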


[Figure: # web servers versus time, 1992 to 2001]


2-dimensional informetrics

# authors in a field (sources)
# articles in a field (items)
+ indicating which author has written which papers

S = set of sources
I = set of items

IPP = Information Production Process


Examples of IPPs

S                    F    I
Authors                   Articles
Journals                  Articles
Articles                  Citations (to/from)
Books                     Borrowings
Words (= types)           Use of words in a text (= tokens)
Web sites                 Hyperlinks (in-/out-)
Web sites                 Web pages
Cities/villages           Inhabitants
Employees                 Their production
Employees                 Their salaries


1. f = size-frequency function:

$f(n)$ for n = 1,2,3,…

$f(n)$ = # sources with n items

2. g = rank-frequency function:

$g(r)$ for r = 1,2,3,…

$g(r)$ = # items in the source on rank r (sources are ranked in decreasing order of the number of items they have)
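
Both functions can be computed directly from the number of items per source. This is a minimal sketch on a hypothetical toy data set (the author names and counts are invented for illustration, and the function names are not from the slides):

```python
from collections import Counter


def size_frequency(items_per_source):
    """Map n -> number of sources that have exactly n items."""
    return dict(sorted(Counter(items_per_source.values()).items()))


def rank_frequency(items_per_source):
    """List whose entry r-1 is the number of items of the source on rank r."""
    return sorted(items_per_source.values(), reverse=True)


# Hypothetical toy IPP: authors (sources) and their number of articles (items).
toy_ipp = {"author_a": 5, "author_b": 1, "author_c": 2, "author_d": 1, "author_e": 1}

print(size_frequency(toy_ipp))
print(rank_frequency(toy_ipp))
```

The same data therefore give two views: how many sources share a given productivity, and how productive the source at a given rank is.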


Continuous model

Source densities

Item densities


Lotkaian Informetrics

The law of Lotka and the law of Zipf

Lotka (1926):

$f(n) = C / n^{\alpha}$, with $C > 0$ and $\alpha > 1$.

The value $\alpha = 2$ is a turning point in informetrics (see further).


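Lotka’s law is equivalent to Zipf’s law, $g(r) = B / r^{\beta}$ with $B, \beta > 0$, known from linguistics. Lotka’s exponent $\alpha$ and Zipf’s exponent $\beta$ are related by $\beta = 1/(\alpha - 1)$.

Zipf’s law in econometrics is called Pareto’s law.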


Dependence of G on . Existence of a Groos droop if .


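log-log scale

A power law such as Lotka’s law $f(n) = C / n^{\alpha}$ becomes $\log f(n) = \log C - \alpha \log n$ = decreasing straight line with slope $-\alpha$.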
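
The straight-line behaviour suggests a quick way to estimate the exponent of a power law from data: regress log f(n) on log n. This is a minimal sketch, assuming ordinary least squares on synthetic, noise-free data. Real data are noisy, so a least-squares fit on a log-log plot is only a rough estimate; the function and variable names are illustrative.

```python
import math


def loglog_fit(ns, fs):
    """Least-squares slope and intercept of log f(n) against log n."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(f) for f in fs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic size-frequency data following f(n) = 1000 / n**2 exactly.
ns = list(range(1, 21))
fs = [1000.0 / n ** 2 for n in ns]

slope, intercept = loglog_fit(ns, fs)
print("estimated exponent:", -slope)
print("estimated constant:", math.exp(intercept))
```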


Rank-frequency distributions for websites


The scale-free property

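f is scale-free if, for every constant $\lambda > 0$, there exists a constant $\mu > 0$ such that $f(\lambda x) = \mu\, f(x)$ for all x.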


Theorem (i)⇔(ii):

(i) f is continuous, decreasing and scale-free

(ii) f is a decreasing power function:

$f(x) = C / x^{\alpha}$ with $C > 0$ and $\alpha > 0$,

i.e. Lotka’s law
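
One direction of the theorem is easy to check numerically: for a power function, rescaling the argument multiplies the function by a factor that does not depend on x. This is a minimal sketch, with an exponential function added as a contrast; the function names and parameter values are illustrative assumptions.

```python
import math


def power_function(x, c=2.0, alpha=1.5):
    """Decreasing power function c / x**alpha."""
    return c / x ** alpha


def exponential_function(x):
    """Decreasing exponential, used here as a non-scale-free contrast."""
    return math.exp(-x)


def rescaling_factor(f, x, scale):
    """f(scale * x) / f(x); constant in x exactly when f is scale-free."""
    return f(scale * x) / f(x)


# For the power function the factor is scale**(-alpha) at every x.
power_factors = [rescaling_factor(power_function, x, 3.0) for x in (1.0, 7.0, 40.0)]
# For the exponential the factor changes with x.
exp_factors = [rescaling_factor(exponential_function, x, 3.0) for x in (1.0, 2.0)]
print(power_factors, exp_factors)
```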


Explanation of Lotka’s law based on exponential growth of sources and items (Naranan (1970)) and an interpretation of Lotkaian IPPs as self-similar fractals (Egghe (2005))

Fractals and fractal dimension


1. Divide a line piece into 3 equal parts ⇒ we need $3 = 3^{1}$ line pieces of this length to cover the original line piece:
:3 ⇒ need $3 = 3^{1}$ ⇒ dim = 1


2. Divide the sides of a square into 3 equal parts ⇒ we need $9 = 3^{2}$ squares with this side length to cover the original square:
:3 ⇒ need $9 = 3^{2}$ ⇒ dim = 2

3. The same for a cube:
:3 ⇒ need $27 = 3^{3}$ ⇒ dim = 3


Construction of the triadic Koch curve


4. For the triadic Koch curve:
:3 ⇒ need $4 = 3^{D}$ ⇒ dim = D with $D = \ln 4 / \ln 3 \approx 1.26$

The Koch curve is a proper fractal with fractal dimension $D = \ln 4 / \ln 3 \approx 1.26$.

= Complexity theory

= Fractal theory

Mandelbrot
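
The four cases follow one rule: if reducing the length scale by a factor s requires N pieces to cover the original object, then N = s^D, so D = ln N / ln s. This is a minimal sketch of that calculation (the function name is illustrative):

```python
import math


def similarity_dimension(pieces_needed, reduction_factor):
    """Solve pieces_needed = reduction_factor**D for D."""
    return math.log(pieces_needed) / math.log(reduction_factor)


examples = {
    "line piece": (3, 3),
    "square": (9, 3),
    "cube": (27, 3),
    "triadic Koch curve": (4, 3),
}

for name, (pieces, factor) in examples.items():
    print(name, round(similarity_dimension(pieces, factor), 4))
```

Only the Koch curve yields a non-integer value, which is what makes it a proper fractal.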


Naranan (Nature, 1970)

Theorem:

(i) The number of sources grows exponentially in time t: $N(t) = c_1 a_1^{t}$ ($c_1 > 0$, $a_1 > 1$).

(ii) The number of items in each source grows exponentially in time.

(iii) The growth rate in (ii) is the same for every source. (ii) and (iii) together imply a fixed exponential function $p(t) = c_2 a_2^{t}$ ($c_2 > 0$, $a_2 > 1$) for the number of items in each source at time t.


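Then this IPP is Lotkaian, i.e. the law of Lotka applies: if f(p) denotes the number of sources with p items, we have

$f(p) = C / p^{\alpha}$

where $\alpha = 1 + \ln a_1 / \ln a_2$, with $a_1$ the growth rate of the number of sources in (i) and $a_2$ the common growth rate of the number of items per source in (ii) and (iii).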
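
The mechanism behind the theorem can be checked with a short calculation. Old sources are few (the number of sources grows exponentially) but large (each source grows exponentially), and combining the two gives a power law. This is a minimal sketch, assuming growth rate a1 for the number of sources and a2 for the items per source, with the Lotka exponent 1 + ln a1 / ln a2 (the logarithm-ratio relation described further on in these slides); the parameter names and values are illustrative assumptions.

```python
import math


def lotka_exponent(a1, a2):
    """1 + ln(a1)/ln(a2) for source growth rate a1 and item growth rate a2."""
    return 1.0 + math.log(a1) / math.log(a2)


def sources_with_at_least(p, now, c1, a1, c2, a2):
    """Number of sources holding at least p items at time `now`.

    A source of age s holds c2 * a2**s items, so it needs age
    s >= log(p / c2) / log(a2); c1 * a1**(now - s) sources are that old.
    """
    min_age = math.log(p / c2) / math.log(a2)
    return c1 * a1 ** (now - min_age)


def cumulative_slope(p_low, p_high, **params):
    """Log-log slope of the cumulative count between two item numbers."""
    rise = math.log(sources_with_at_least(p_high, **params)) - math.log(
        sources_with_at_least(p_low, **params)
    )
    return rise / (math.log(p_high) - math.log(p_low))


params = dict(now=20.0, c1=1.0, a1=4.0, c2=1.0, a2=3.0)
# The cumulative count decays with exponent alpha - 1, the density with alpha.
print(cumulative_slope(10.0, 1000.0, **params), lotka_exponent(4.0, 3.0))
```

The cumulative count has slope -(alpha - 1) on a log-log scale; differentiating with respect to p adds one to the exponent and gives the size-frequency function with exponent alpha.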


Egghe (2005) (Book and JASIST)

(i) The number of line pieces grows exponentially in time t, here proportional with $4^{t}$.

(ii),(iii) 1/length of each line piece grows exponentially in time t and with the same growth rate 3. Hence we have growth proportional with $3^{t}$.


Rephrased in terms of informetrics: a (Lotkaian) IPP is a self-similar fractal and its fractal dimension is given by the logarithm of the growth rate of the sources, divided by the logarithm of the growth rate of the items:

$D_s = \ln a_1 / \ln a_2$

(which can be > or < 1), with $a_1$ the growth rate of the sources and $a_2$ the growth rate of the items. Hence, the exponent in Lotka’s law satisfies the important relation:

$\alpha = D_s + 1$

This result was earlier seen by Mandelbrot but only in the context of (artificial) random texts (hence in linguistics).


Further applications of Lotkaian Informetrics

Concentration theory (inequality theory): Lorenz curves (cf. econometrics). Egghe (2005) (Book, Chapter IV).

Fractional modelling of authorship (case of multi-authored articles): determine the fractional size-frequency function = # authors with a given fractional number of articles (fractional counting: an author in an m-authored paper receives a score 1/m).
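
Fractional counting is straightforward to carry out on a list of papers. This is a minimal sketch on hypothetical data (the author labels and papers are invented, and the function names are not from the slides); exact fractions are used so that equal scores compare as equal.

```python
from collections import Counter
from fractions import Fraction


def fractional_scores(papers):
    """Give each author of an m-authored paper a score of 1/m and sum per author."""
    scores = Counter()
    for authors in papers:
        share = Fraction(1, len(authors))
        for author in authors:
            scores[author] += share
    return dict(scores)


def fractional_size_frequency(scores):
    """Map each fractional score to the number of authors having that score."""
    return dict(Counter(scores.values()))


# Hypothetical author lists, one list per paper.
papers = [["a", "b"], ["a"], ["b", "c", "d"], ["a", "b"]]

scores = fractional_scores(papers)
frequency = fractional_size_frequency(scores)
print(scores)
print(frequency)
```

The scores always sum to the number of papers, which is the property that distinguishes fractional counting from counting every co-author in full.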


Theoretical and experimental fractional frequency distributions (case of i=4).


Dynamics of Lotkaian IPPs, described via transformations on the sources and on the items: includes the description of dynamics of networks. Relations with 3-dimensional informetrics: See new journal: L. Egghe. General evolutionary theory of IPPs and applications to the evolution of networks. Journal of Informetrics 1(2), 115-122, 2007


Item transformation

Source transformation

New rank-frequency function


Theorem: New size-frequency function

where


Case is example of “linear 3 dimensional informetrics”

Sources1 → Items1 = Sources2 → Items2

Examples:
1. Webpages → hyperlinks → use of hyperlinks
2. Library subject categories → books → borrowings

See further. Back to the general case.


Power law transformations in Lotkaian IPPs


Theorem:

is only dependent on b/c due to the scale-free nature of Lotkaian systems.


Corollary:

With this, one can study the evolution of an IPP, e.g. a part of WWW: V. Cothey (2007): confirms theory except in one case where non-Lotkaian evolution is found, probably due to “automatic” creation of web pages (deviation from a social network).


Further application:

IPPs without low productive sources

(Egghe and Rousseau (2006))

Take : sources remain but they grow in number of items:

Now


and (since )

Evolution: decreasing Lotka exponent and no low productive sources


Examples

1. Country sizes: data from www.gazetteer.de (July 10, 2005): 237 countries : = 1.69 (best fit)

2. Municipalities in Malta (1997 data): 67 municipalities: = 1.12 (best fit)

3. Database sizes: on the topic “fuzzy set theory” (20 largest databases on this topic) (Hood and Wilson (2003)): = 1.09 (best fit)

4. Unique documents in databases (20 databases above): =1.33 (best fit).


Application of Lotka’s law to the modelling of the cumulative first-citation distribution, i.e. the distribution over time at which an article receives its first citation.


The time t1 at which an article receives its first citation is an important indicator of the visibility of research.

At t1 the article switches its status from “unused” to “used”.

t1 is a measure of immediacy but, of course, different from the immediacy index (Thomson Scientific).


The distribution of t1 over a group of articles is the topic of the present study. We will study the cumulative first-citation distribution

$\Phi(t_1)$ = cumulative fraction of all papers that have, at t1, at least 1 citation.


Rousseau (1994) uses two different differential equations to model two types of graphs: a concave one and an S-shaped one. These equations are not explained and are not linked to any informetric distribution.


In Egghe (2000), I use only 2 elementary informetric tools:

$c(t)$ = the density function of citations to an article, t time after its publication (exponential, $c(t) = c\, a^{t}$, $0 < a < 1$),

$\varphi(A)$ = the density function of the number of papers with A citations in total (Lotka, $\varphi(A) = B / A^{\alpha}$, $\alpha > 1$), (only ever cited papers are used here).


Normalizing to distributions :

becomes for an article with

A citations in total

becomes but we will use

the fraction of ever cited articles, in order to include also the never cited articles.


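Theorem:

$\Phi(t_1) = \gamma\,(1 - a^{t_1})^{\alpha - 1}$,

concave if $1 < \alpha \le 2$, S-shaped if $\alpha > 2$, hence explaining both shapes in one model. Here $\gamma$ is the fraction of ever cited articles, a the aging rate of the citations and $\alpha$ Lotka’s exponent.

Note the turning point $\alpha = 2$.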
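
The change of shape can be seen numerically. This is a minimal sketch, assuming the model published in Egghe (2000), in which the cumulative first-citation fraction is gamma * (1 - a**t1)**(alpha - 1), with gamma the fraction of ever cited articles, a the aging rate (0 < a < 1) and alpha Lotka's exponent. The parameter values and the function names are illustrative assumptions.

```python
def first_citation_fraction(t1, gamma, a, alpha):
    """Cumulative fraction of articles cited at least once by time t1."""
    return gamma * (1.0 - a ** t1) ** (alpha - 1.0)


def starts_convex(gamma, a, alpha, step=0.01):
    """True if the curve bends upwards near t1 = 0 (the start of an S-shape)."""
    y0, y1, y2 = (first_citation_fraction(k * step, gamma, a, alpha) for k in range(3))
    return y2 - 2.0 * y1 + y0 > 0.0


for alpha in (1.5, 2.0, 3.0):
    shape = "S-shaped" if starts_convex(0.8, 0.9, alpha) else "concave"
    print(alpha, shape)
```

Near t1 = 0 the curve behaves like t1**(alpha - 1), which bends upwards only when alpha - 1 > 1; this is where the turning point comes from.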


Proof : A first citation is received if

(*)

⇒ Cumulative fraction of all articles

that are already cited at time t1:

(**)

⇒ (*) into (**) yields
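
The two steps of the proof can be written out as follows. This is a sketch under stated assumptions: the citation density is normalized to $(-\ln a)\,a^{t}$, the Lotka density is normalized to $(\alpha - 1)/A^{\alpha}$ on $A \ge 1$, and $\gamma$ denotes the fraction of ever cited articles. The normalizations and symbol names are assumptions and may differ in detail from the original formulas.

```latex
% (*) An article with A citations in total has received one citation at time t_1 when
A \int_0^{t_1} (-\ln a)\, a^{t}\, dt \;=\; A\,\bigl(1 - a^{t_1}\bigr) \;=\; 1
\quad\Longrightarrow\quad
A \;=\; \frac{1}{1 - a^{t_1}} .

% (**) Articles with at least that many citations in total are already cited at t_1:
\Phi(t_1) \;=\; \gamma \int_{A}^{\infty} \frac{\alpha - 1}{A'^{\alpha}}\, dA'
          \;=\; \gamma\, A^{1-\alpha} .

% Substituting (*) into (**):
\Phi(t_1) \;=\; \gamma\,\bigl(1 - a^{t_1}\bigr)^{\alpha - 1} .
```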


Motylev (1981)


fit :


fit :

Rousseau (1994)

JACS to JACS data of Rousseau

Time-unit = 2 weeks, 4-year period