Beyond keywords: A network science approach
to the structure of SPSSI
Kevin Lanning, Fla Atl U
The reciprocal relevance of SPSSI and network science
Historical: Lewin, Heider, Milgram, …
Structural: A language for the study of communities
Next U
NY Times, 2/19/2012
Network science can inform the study of inequality
I: Structure
Preferential attachment or the ‘Matthew Effect’
A common feature of diverse complex social systems
Not just citations
A source of inequality
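To make the ‘Matthew Effect’ concrete, here is a minimal simulation sketch. It is not the analysis behind these slides: the function name, the number of papers, the references per paper, and the "citations received + 1" weighting are all illustrative assumptions. Each new paper cites earlier papers with probability proportional to how often they have already been cited, and a small share of papers ends up with a large share of the citations.

```python
import random
from collections import Counter

def simulate_citations(n_papers=2000, refs_per_paper=5, seed=0):
    """Grow a toy citation network by preferential attachment.

    Each new paper cites up to `refs_per_paper` distinct earlier papers,
    chosen with probability proportional to (citations received + 1).
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    citations = Counter()  # paper id -> citations received
    # The urn holds one entry per paper plus one entry per citation received,
    # so a uniform draw from it is a draw proportional to (in-degree + 1).
    urn = [0]
    for new in range(1, n_papers):
        k = min(refs_per_paper, new)
        targets = set()
        while len(targets) < k:
            targets.add(rng.choice(urn))
        for t in targets:
            citations[t] += 1
            urn.append(t)
        urn.append(new)
    return [citations[p] for p in range(n_papers)]

counts = simulate_citations()
ranked = sorted(counts, reverse=True)
top_share = sum(ranked[: len(ranked) // 100]) / sum(ranked)
print(f"top 1% of papers receive {top_share:.0%} of all citations")
```

The same rich-get-richer rule can be applied to other quantities, which is the sense in which it is "not just citations".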
Network science can inform the study of inequality
II: Content
A pilot analysis
• 32 ASAP papers on “inequality” linked by 1573 references
• 6 research communities
Explorations of the SPSSI citation network
Network parameters have meaning at five levels of analysis
Level of analysis   Concept / parameter       Relevance / interpretation
Network (dynamic)   Preferential attachment   Developmental trajectories of topics, scholars
Network (static)    Giant component           Connectedness of a research community
Community           Modularity                Topics, subdisciplines, cliques, categories
Path                Diameter, path length     Distance and proximity of papers, scholars…
Author/paper        (In)degree, centrality    Mechanisms of influence, impact, eminence
Bigger data
• All papers published in JSI, ASAP from 2001-2013.
• First author, journal, year, cited papers
• N sources = 855
• 38,854 references (45.4 per source)
• - 2,042 self-references (5.3%)
• - 3,198 (8.2%) unusable: references to news articles, government institutes, or without a date
____________________________
• 33,615 usable citations (86.5%)
• 24,263 unique papers
• 14,702 unique first authors
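The tally above can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones on the slide. Straight subtraction gives 33,614, one fewer than the 33,615 reported; the slide does not say where the difference comes from, and the percentages are unaffected at one decimal place.

```python
# Figures as reported on the slide.
total_refs = 38854
sources = 855
self_refs = 2042
unusable = 3198

per_source = total_refs / sources
remaining = total_refs - self_refs - unusable

print(f"references per source: {per_source:.1f}")
print(f"self-references: {self_refs / total_refs:.1%}")
print(f"unusable: {unusable / total_refs:.1%}")
print(f"remaining after exclusions: {remaining} ({remaining / total_refs:.1%})")
```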
SPSSI citation network: Connectedness
• Of the 24,263 papers, 24,075 (99.2%) are linked in a single giant component
• Papers are separated by an average of 4.2 links
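As a sketch of how these two figures are typically obtained, the code below finds the largest connected component of a small toy graph and the mean shortest-path length within it, using breadth-first search. The toy edges and function names are hypothetical, and treating citation links as undirected is an assumption; the slides do not state which convention was used.

```python
from collections import deque

def components(adj):
    """Return the connected components of an undirected graph as sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        comps.append(comp)
    return comps

def mean_path_length(adj, nodes):
    """Mean shortest-path length over all pairs in one connected component."""
    total = pairs = 0
    for s in nodes:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# Hypothetical toy data: four linked papers plus one isolated pair.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D"), ("E", "F")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

giant = max(components(adj), key=len)
share = len(giant) / len(adj)
avg = mean_path_length(adj, giant)
print(f"giant component holds {share:.1%} of nodes; mean path length {avg:.2f}")
```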
Eminence and network centrality: 3 interpretations
ID: Citation counts from different sources (in-degree), or total cites (weighted in-degree)
PR (PageRank, Eigenvector Centrality): Recursive measures in which the importance of a paper depends on the importance of the papers that cite it
BC (Betweenness Centrality): Extent to which a node bridges different areas of scholarship, introduces work to a new audience, etc.
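The contrast between ID and BC can be seen on a toy example. In the sketch below, the citation pairs and author labels are hypothetical. In-degree counts distinct citing sources, weighted in-degree counts all cites, and betweenness is computed with Brandes' algorithm on the undirected version of the graph (an assumption made for illustration). The bridging node "d" has the highest betweenness even though its cite counts are no higher than those of "c" and "e".

```python
from collections import Counter, deque

# Hypothetical (citing, cited) pairs; a repeated pair means multiple cites.
citations = [
    ("a", "b"), ("a", "c"), ("b", "c"), ("b", "c"),
    ("c", "d"), ("e", "d"),
    ("f", "e"), ("g", "e"), ("g", "e"), ("f", "g"),
]

# ID: distinct citing sources versus total cites.
in_degree = Counter(cited for _, cited in set(citations))
weighted_in_degree = Counter(cited for _, cited in citations)

# Undirected adjacency for the betweenness calculation.
adj = {}
for u, v in citations:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def betweenness(adj):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalized)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

bc = betweenness(adj)
for node in sorted(adj):
    print(node, in_degree[node], weighted_in_degree[node], bc[node])
```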
PageRank is high for papers with commentary
• King (2011)
  • Second highest PR in database
• Explanation
  • Papers which are cited by papers with few references (such as commentaries) can have a disproportionate impact in a sparse network
• Two solutions
  • Omit commentaries and book reviews
  • Treat authors rather than papers as the unit of analysis
• Limitations of citation networks: sparseness, time-constraint
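The explanation above follows from how PageRank splits a citing paper's score across its references. The sketch below is a plain power-iteration PageRank on hypothetical data, with a conventional damping factor of 0.85 chosen as an assumption; it is not the software or the settings used for the SPSSI analysis. Papers "X" and "Y" are each cited exactly once, but "X" is cited by a commentary with a single reference while "Y" shares a review's score with nine other references.

```python
def pagerank(out_links, damping=0.85, iters=100):
    """Power-iteration PageRank; dangling nodes spread their score uniformly."""
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = dict.fromkeys(nodes, 1.0 / n)
    for _ in range(iters):
        dangling = sum(rank[u] for u in nodes if not out_links.get(u))
        base = (1 - damping) / n + damping * dangling / n
        new = dict.fromkeys(nodes, base)
        for u, targets in out_links.items():
            for v in targets:
                # A citing paper's score is divided among its references.
                new[v] += damping * rank[u] / len(targets)
        rank = new
    return rank

# Hypothetical reference lists: a commentary with one reference,
# a review with ten.
refs = {
    "commentary": ["X"],
    "review": ["Y"] + [f"other{i}" for i in range(9)],
}
pr = pagerank(refs)
print(f"PR(X) = {pr['X']:.4f}, PR(Y) = {pr['Y']:.4f}")
```

Omitting commentaries removes the short reference lists that cause the effect; aggregating to authors pools many citing papers per node, which dilutes any single one.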
The SPSSI author network: (almost) no one is an island
• 14,703 unique authors
• All but 6 are linked to the main component
• Average path between nodes = 5.1
• 32-38 communities*
• Average author is linked 1.9 times
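Community counts like these come from algorithms that search for partitions with high modularity. As a minimal sketch, assuming an undirected, unweighted graph and using hypothetical toy data, the function below scores a given partition with the standard modularity formula Q = Σ_c [L_c / m − (d_c / 2m)²], where L_c is the number of links inside community c, d_c is the summed degree of its members, and m is the total number of links. The slides do not say which detection algorithm was used.

```python
from collections import Counter

def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = Counter()    # links with both ends in the same community
    degree_sum = Counter()  # summed degree of each community's members
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Hypothetical toy graph: two triangles joined by a single link.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("d", "f"), ("e", "f"),
         ("c", "d")]

by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
by_pairs = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2, "f": 2}

q_triangle = modularity(edges, by_triangle)
q_pairs = modularity(edges, by_pairs)
print(f"Q by triangle: {q_triangle:.3f}; Q by pairs: {q_pairs:.3f}")
```

The partition that follows the two triangles scores higher than the one that cuts across them.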
Whole network
The SPSSI author network: Most cited
Includes 68 authors with 20 or more citations. Nodes ranked by eigenvector centrality
The SPSSI author network: Centrality
• Content of rankings
• Betweenness (bridging centrality) vs. other measures
Gender effects in citation networks?
• King (2014): Self-citations
• Here, a modest but possibly consequential effect
                     Directed   Undirected
r (gender, BC-EC)    0.17       0.22
t                    1.70       2.18
p (one-tailed)       0.05       0.02
JSI/ASAP network; analysis includes only top, bottom 50 in BC-EC (not effect sizes)
The SPSSI author network: Allport and Lewin communities compared
Lewin community includes authors with 5 or more cites; Allport includes authors with 13+ cites. Nodes ranked by eigenvector centrality
(How) has ASAP changed SPSSI?
                          Total    JSI      ASAP    only ASAP
unique authors            696      491      233     205
unique cited              14568    11704    4848    2864
unique scholars (nodes)   14702    11778    4942    2924
Summary, concluding thoughts
• Eminence: Great persons and beyond
• Centrality: Different measures have distinct interpretations
• Connectedness: To see small worlds, you need big data
• Communities: Discrete clusters are artificial
• Distance: Is more interpretable than proximity
• Obsolescence: This work is primitive
• Bigger data and much more sophisticated methods lie ahead
…Safe home
Content: Allport, Pettigrew, Tajfel. But we should resist the temptation to focus only on Great Men, on persons without situations, on the fallacy of independence. As Heather Bullock reminded us in her talk, none of us has built our work alone; as Stephanie Fryberg noted, we need to consider interdependent as well as independent sources of scholarly achievements.
Centrality
Small worlds: The law of large numbers applies in how we get to the truth of our connectedness. Giant components and small worlds are more apparent as our data become more complete.
Citation networks in personality and social psychology are small worlds in which virtually all of us can be connected.
On articulating the space
Clustering: “Communities” are fuzzy, artificial, and lack robustness.
Distance and proximity
Obsolescence
Primitive: small big data. King studied 1.6 million cites. Others have looked at similar questions in a much more sophisticated way.
First authors as opposed to all authors
Authors as compared with full citation
Boyack: more coherent networks can be obtained if one also assesses how far apart references are cited in the source... for example, in the beginning of the introduction or in the methods section.