Pip Pattison University of Melbourne UKSNA, University of
Greenwich, June 2013 A hierarchy of exponential random graph models
for the analysis of social networks
Slide 2
Acknowledgments
Joint work with Garry Robins, Peng Wang and Tom Snijders
University of Melbourne: Garry Robins, Peng Wang, Galina Daraganova, David Rolls
University of Oxford: Tom Snijders
University of Manchester: Johan Koskinen
Swinburne University: Dean Lusher
Slide 3
Outline
1. Structure in networks
2. The ERGM framework for network modelling
3. Hierarchy of dependence structures for ERGMs
4. Five networks
5. Applications
Slide 4
1. Structure in networks
Slide 5
Cartwright and Harary: Psychological Review, 1956
We expect:
- negative ties to be bipartite in form (or k-partite in generalisations)
- positive ties to be potentially clustered
Slide 6
Granovetter: American Journal of Sociology, 1973
We expect:
- closed triangles in strong ties
- local bridges to be weak
Slide 7
Jackson & Wolinsky: Journal of Economic Theory, 1996
We expect:
- disconnected cliques
- stars
Slide 8
Watts & Strogatz: Nature, 1998
We expect:
- high concentration of triangles
- short paths
- low density
- absence of hubs
Slide 9
Degree effects: degree assortativity and disassortativity (e.g. Newman, 2003)
We expect:
- relatively high (or low) rates of connection among high-degree nodes
Slide 10
Burt: American Journal of Sociology, 2004
Robins (2009): We expect to see brokers who are:
- embedded in groups
- bridging to other groups
Slide 11
Bearman, Moody & Stovel: American Journal of Sociology, 2004
We expect:
- an absence of 4-cycles (and 3-cycles)
Slide 12
Jackson, Rodriguez-Barraquer & Tan: American Economic Review, 2012
We expect:
- m-cliques but not (m+1)-cycles
Slide 13
An aside
Paper                                        Citations (WoS, June 26, 2013)
Cartwright & Harary (1956)                    534
Granovetter (1973)                           5833
Jackson & Wolinsky (1996)                     416
Watts & Strogatz (1998)                      7572
Newman (2003)                                 507
Burt (2004)                                   491
Bearman, Moody & Stovel (2004)                133
Jackson, Rodriguez-Barraquer & Tan (2012)       0
Our fascination with network structure runs deep!
Slide 14
Other regularities in network structure
Other hypothesised sources of regularity in network structure include:
- Homophily and heterophily effects (e.g. McPherson, Smith-Lovin & Cook, 2001)
- Consequences of social foci and other settings (Feld, 1985; Pattison & Robins, 2002)
- Embedding in geographical, organisational and sociocultural contexts (e.g. Daraganova et al., 2012; Lomi et al., in press; White, 1992)
- Interdependence or embeddedness with other networks (e.g. Granovetter, 1985; Padgett & McLean, 2006)
Slide 15
Harrison White on network ties
Notably, almost all of these hypotheses about structural regularity are based on arguments about local interaction in networks:
"A social tie exists in, and only in, a relation between actors which catenates, that is entails (some) compound relation through other such ties of those actors. Thus it is subject to, and known to be subject to, the hegemonic pressures of others engaged in the social construction of that network" (White, 1998)
Slide 16
2. General modelling framework
Slide 17
Network models
Network models should:
- reflect known and hypothesised processes for network tie formation (such as those just mentioned)
- be dynamic, where possible, and consistent with known or hypothesised dynamics
- allow us to test propositions about network structure and process
- allow us to understand the consequences of network structure and process
For cross-sectional data, the exponential random graph modelling (ERGM) framework is convenient
Slide 18
Exponential random graph models (ERGMs)
- We regard the nodes of a network as fixed, and treat potential ties among nodes as variables that are dependent on exogenous attributes of the nodes and potential ties and, potentially, on one another.
- The form of assumed dependence among tie variables leads to a general form of a probability model (an exponential random graph model) for the ensemble of tie variables.
- Additional simplifying assumptions are then imposed.
- The model can be estimated by Markov chain Monte Carlo maximum likelihood estimation (MCMCMLE) from an observation on the network (and relevant node- or dyad-level covariates); see Snijders (2002).
Slide 19
Exponential random graph model (ERGM)
Y(i,j) is a tie variable: Y(i,j) = 1 if node i is tied to node j, 0 otherwise
Ensemble of tie variables: Y = [Y(i,j)] (tie variables), y = [y(i,j)] (realisations)
$$P(Y = y) = \frac{1}{\kappa(\theta)} \exp\Big\{ \sum_p \theta_p z_p(y) \Big\}$$ (Frank & Strauss, 1986)
- z_p(y) are network statistics (network effects)
- \theta_p are corresponding parameters
- \kappa(\theta) is a normalising quantity
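For a very small network the model form on this slide can be evaluated exactly by enumerating every graph. The sketch below is only an illustration under stated assumptions: it uses undirected graphs, takes edge and triangle counts as the two statistics, and the function names (`statistics`, `ergm_distribution`) and the parameter values are hypothetical choices, not part of the talk.

```python
from itertools import combinations, product
from math import exp


def statistics(edges, n):
    """Return (edge count, triangle count) for an undirected graph on nodes 0..n-1."""
    edge_set = set(edges)
    triangles = sum(
        1
        for i, j, k in combinations(range(n), 3)
        if (i, j) in edge_set and (i, k) in edge_set and (j, k) in edge_set
    )
    return len(edge_set), triangles


def ergm_distribution(n, theta):
    """Enumerate all graphs on n nodes; return {tuple of edges: probability}."""
    dyads = list(combinations(range(n), 2))
    weights = {}
    for pattern in product([0, 1], repeat=len(dyads)):
        edges = tuple(d for d, y in zip(dyads, pattern) if y)
        z = statistics(edges, n)
        # Unnormalised weight: exp of the parameter-weighted sum of statistics.
        weights[edges] = exp(sum(t * s for t, s in zip(theta, z)))
    kappa = sum(weights.values())  # normalising quantity, summed over all graphs
    return {g: w / kappa for g, w in weights.items()}


# Illustrative (made-up) parameters: negative edge effect, positive triangle effect.
dist = ergm_distribution(4, theta=(-1.0, 0.5))
```

Enumeration is feasible only for a handful of nodes (4 nodes already give 64 graphs), which is why simulation-based estimation is used for networks of realistic size.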
Slide 20
3. Dependence structures
Slide 21
Characterising the proximity of potential network ties
Under what circumstances is the tie linking node a and node b conditionally dependent on the tie linking node c and node d?
[Diagram: nodes a, b, c, d]
When each of actors a and b is already linked to both actors c and d, and conversely?
Criterion: strict inclusion
Slide 22
Characterising the proximity of potential network ties
Under what circumstances is the tie linking node a and node b conditionally dependent on the tie linking node c and node d?
[Diagram: nodes a, b, c, d]
When each of actors a and b is already linked to at least one of actors c and d, and conversely?
Criterion: inclusion
Slide 23
Characterising the proximity of potential network ties
Under what circumstances is the tie linking node a and node b conditionally dependent on the tie linking node c and node d?
[Diagram: nodes a, b, c, d]
When at least one of actors a and b is already linked to both actors c and d?
Criterion: partial inclusion
Slide 24
Characterising the proximity of potential network ties
Under what circumstances is the tie linking node a and node b conditionally dependent on the tie linking node c and node d?
[Diagram: nodes a, b, c, d]
When at least one of actors a and b is already linked to at least one of actors c and d, and conversely?
Criterion: distance
Slide 25
A second dimension: varying path length
a. Strict p-inclusion, SI_p (p > 0)
b. p-inclusion, I_p
c. Partial p-inclusion, PI_p
d. p-distance criterion, D_p
[Four diagrams, one per criterion, each on nodes a, b, c, d]
Key: red lines indicate existing paths of length p or less (p ≥ 0); blue dashed lines indicate potential ties, Y_ab and Y_cd
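The four criteria can be sketched as predicates on an observed graph. This is a literal reading of the slide statements, with "linked" taken to mean "joined by an existing path of length p or less"; the published definitions may differ in detail (for example in how paths through the ties ab and cd themselves are treated), and nesting between the criteria is not enforced here. The symmetric "or conversely" clause for partial inclusion is an assumption, and the function names are hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Build a symmetric {node: set of neighbours} mapping from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def distances_from(adj, source):
    """Breadth-first search distances from source; unreachable nodes are absent."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def proximity_criteria(adj, a, b, c, d, p=1):
    """Evaluate the four proximity criteria for the tie variables Y_ab and Y_cd."""
    dist = {x: distances_from(adj, x) for x in (a, b, c, d)}

    def near(x, y):
        return dist[x].get(y, float("inf")) <= p

    def near_both(x, pair):
        return all(near(x, y) for y in pair)

    def near_any(x, pair):
        return any(near(x, y) for y in pair)

    ab, cd = (a, b), (c, d)
    return {
        "SI": all(near_both(x, cd) for x in ab),
        "I": all(near_any(x, cd) for x in ab) and all(near_any(y, ab) for y in cd),
        # Assumption: partial inclusion is applied symmetrically to both ties.
        "PI": any(near_both(x, cd) for x in ab) or any(near_both(y, ab) for y in cd),
        "D": any(near_any(x, cd) for x in ab),
    }


# Existing ties a-c and b-d only: inclusion and distance hold, the others do not.
example = proximity_criteria(build_adjacency([("a", "c"), ("b", "d")]), "a", "b", "c", "d")
```

Setting `p` to a larger value relaxes every criterion at once, which corresponds to the second dimension described on this slide.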
Slide 26
The dependence hierarchy (Pattison & Snijders, 2013)
         SI      I       PI      D
p = 0            I_0  =  PI_0    D_0
p = 1    SI_1    I_1     PI_1    D_1
p = 2    SI_2    I_2     PI_2    D_2
Slide 27
Associated model configurations
SI_p: strict p-inclusion
Each configuration is a subgraph of diameter p (p-club; Mokken, 1979)
For p = 1: cohesive subsets
[Diagram: nodes a, b, c, d]
Slide 28
Associated model configurations
I_p: p-inclusion
Each configuration has the property that every pair of edges lies on a cyclic walk of length (2p+2)
For p = 1: closure
[Diagram: nodes a, b, c, d]
Slide 29
Associated model configurations
PI_p: partial p-inclusion
Each configuration has the property that every pair of edges lies on a cyclic walk of length (2p+2), or on a cyclic walk of length (2p+1) with an edge incident to a node on the cycle
For p = 1: brokerage
[Diagram: nodes a, b, c, d]
Slide 30
Associated model configurations
D_p: p-distance
Each configuration has the property that every pair of edges lies on a path of length p+2
For p = 1: connectivity
[Diagram: nodes a, b, c, d]
Slide 31
Model configurations for the case of p = 0
- SI_0: not defined
- I_0: each configuration is an edge
- PI_0: each configuration is an edge
  (Bernoulli or Erdős-Rényi model: edges are independent)
- D_0: each configuration is such that every pair of edges lies on a path of length 2
  (Markov model; Frank & Strauss, 1986)
Slide 32
The dependence hierarchy (Pattison & Snijders, 2013)
- Cohesion: SI_1 (clique); SI_p (p-club)
- Closure: I_0 = PI_0 (Bernoulli); I_1 (social circuit); I_p (cyclic walk of length 2p+2)
- Brokerage: PI_1 (edge-triangle); PI_p ((r+1)-path-(2(p-r)+1)-cyclic walk, 0 ≤ r ≤ p-1)
- Connectivity: D_0 (Markov); D_1 (3-path); D_p (path of length p+2)
Slide 33
Other assumptions
1. Homogeneity: isomorphic configurations have equal parameters (Frank & Strauss, 1986)
2. Related effects: a single statistic for a family of related configurations, such as:
   - m-stars
   - m-triangles, m-2-paths
   - m-edge-triangles
   (Snijders et al., 2006; Hunter & Handcock, 2006)
Slide 34
Resulting model effects often include:
- Edge: propensity for an edge to occur
- Alternating star: (endogenous) propensity for edges to attach to nodes with edges (progressively discounted for additional edges), hence the level of dispersion of the degree distribution
- Alternating 2-path: propensity for the presence of shared partners (progressively discounted for additional shared partners)
- Alternating triangle: propensity for an association between an edge linking nodes and their propensity for shared partners (progressively discounted for additional shared partners) (closure)
- Alternating edge-triangle: propensity for an association between degree and closure (progressively discounted for higher degrees)
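Three of these statistics can be computed directly from an adjacency structure. The sketch below assumes the closed-form expressions for the alternating statistics given by Snijders et al. (2006), and assumes that the "(2.00)" in labels such as AT(2.00) denotes a damping value of 2. The alternating edge-triangle statistic is not attempted, and the function name is hypothetical.

```python
from itertools import combinations


def alternating_statistics(adj, lam=2.0):
    """Edge count and alternating star / triangle / 2-path statistics.

    adj maps each node to the set of its neighbours (undirected, symmetric).
    lam is the damping value applied to each additional star edge or shared partner.
    """
    nodes = sorted(adj)
    decay = 1.0 - 1.0 / lam
    edges = sum(len(adj[v]) for v in nodes) // 2

    # Alternating stars: S_2 - S_3/lam + S_4/lam^2 - ..., summed in closed form.
    alt_star = lam**2 * sum(
        decay ** len(adj[v]) + len(adj[v]) / lam - 1.0 for v in nodes
    )

    alt_triangle = 0.0
    alt_two_path = 0.0
    for i, j in combinations(nodes, 2):
        shared = len(adj[i] & adj[j])  # number of shared partners of i and j
        term = lam * (1.0 - decay**shared)
        alt_two_path += term           # every pair contributes to 2-paths
        if j in adj[i]:
            alt_triangle += term       # only tied pairs contribute to triangles
    return {
        "edges": edges,
        "alt_star": alt_star,
        "alt_triangle": alt_triangle,
        "alt_two_path": alt_two_path,
    }


# A single triangle: three 2-stars, one triangle, three 2-paths.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
stats = alternating_statistics(triangle)
```

With a damping value of 2, each additional shared partner (or star edge) adds half as much as the previous one, which is the "progressively discounted" behaviour described in the list above.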
Slide 35
4. Five networks
Slide 36
Gift-giving (taro exchange) among households in a Papuan village* (n = 22)
*Hage P. and Harary F. (1983). Structural models in anthropology. Cambridge: Cambridge University Press.
Schwimmer E. (1973). Exchange in the social structure of the Orokaiva. New York: St Martin's.
Slide 37
Interaction network in a university karate club (n = 34)
Zachary W. (1977). An information flow model for conflict and
fission in small groups. Journal of Anthropological Research, 33,
452-473.
Slide 38
Kapferer's tailor shop in Zambia: sociational (friendship and socioemotional) ties, time 2* (n = 39)
*Kapferer B. (1972). Strategy and transaction in an African factory. Manchester: Manchester University Press.
Slide 39
An Australian government organisation (n=60): important
ties
Slide 40
A dolphin community near Doubtful Sound, NZ* (n = 62) *D.
Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S.
M. Dawson, The bottlenose dolphin community of Doubtful Sound
features a large proportion of long-lasting associations,
Behavioral Ecology and Sociobiology 54, 396-405 (2003).
Slide 41
5. Applications
Slide 42
Gift-giving (taro exchange) among households in a Papuan village* (n = 22)
*Hage P. and Harary F. (1983). Structural models in anthropology. Cambridge: Cambridge University Press.
Schwimmer E. (1973). Exchange in the social structure of the Orokaiva. New York: St Martin's.
Slide 43
Heuristic goodness of fit: degree statistics
The t statistic locates the observed value of each statistic in the distribution of statistics associated with the ERGM simulated using model parameters: if |t| ≤ 2, the observed statistic is within the envelope expected by the model.
For example, for the Bernoulli model: edge effect = -1.59 (est. se = 0.17)
statistic    observed   simulated mean (sd)   t
triangles    10         7.481 (4.151)         0.607
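The heuristic can be sketched in a few lines: simulate graphs from the fitted model, compute the statistic on each, and standardise the observed value by the simulated mean and standard deviation. The sampler below is a crude Metropolis edge-toggle scheme for an edge plus triangle model, restarted from the empty graph for every draw; it is an illustrative assumption, not the software used for the results in this talk, and the step and draw counts are arbitrary.

```python
import random
from itertools import combinations
from math import exp
from statistics import mean, stdev


def count_triangles(adj):
    """Count triangles in an undirected graph stored as {node: set of neighbours}."""
    return sum(
        1
        for i, j, k in combinations(sorted(adj), 3)
        if j in adj[i] and k in adj[i] and k in adj[j]
    )


def simulate(n, theta_edge, theta_triangle, steps, rng):
    """Crude Metropolis edge-toggle sampler, started from the empty graph."""
    adj = {v: set() for v in range(n)}
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        # Adding tie ij changes the edge count by 1 and the triangle count
        # by the number of partners i and j share; removing it reverses both.
        shared = len(adj[i] & adj[j])
        change = theta_edge + theta_triangle * shared
        present = j in adj[i]
        if present:
            change = -change
        if change >= 0 or rng.random() < exp(change):
            if present:
                adj[i].discard(j)
                adj[j].discard(i)
            else:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def t_ratio(observed, sim_mean, sim_sd):
    """Standardised distance of the observed statistic from the simulated mean."""
    return (observed - sim_mean) / sim_sd


# Bernoulli model for the taro network: n = 22, edge effect -1.59, no triangle effect.
rng = random.Random(2013)
draws = [count_triangles(simulate(22, -1.59, 0.0, 1500, rng)) for _ in range(100)]
t_simulated = t_ratio(10, mean(draws), stdev(draws))

# The slide's own figures: observed 10, simulated mean 7.481, sd 4.151.
t_slide = t_ratio(10, 7.481, 4.151)
```

`t_slide` reproduces the 0.607 reported on the slide; `t_simulated` will differ somewhat because it rests on only 100 short, independent chains.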
Slide 44
Taro exchange: Bernoulli
effects        estimates   stderr
Edge           -1.590      0.174

effects        observed   mean     stddev   t-ratio
2-star         109        132.5    39.0     -0.604
3-star         80         141.6    67.4     -0.913
triangles      10         7.481    4.151     0.607
SD degrees     0.963      1.658    0.261    -2.663
Skew degrees   1.254      0.236    0.405     2.515
GCC*           0.275      0.160    0.057     2.017
Mean LCC*      0.339      0.151    0.066     2.851
Var LCC*       0.045      0.044    0.028     0.044
*GCC is the global clustering coefficient, LCC is the local clustering coefficient
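Because ties are independent under the Bernoulli model, the simulated means in this table can be cross-checked analytically: the edge parameter is the log-odds of a tie, and expected subgraph counts follow from the tie probability. The short calculation below is a sanity check added for illustration, using only n = 22 and the edge estimate of -1.590 from the slide.

```python
from math import comb, exp

n = 22
theta_edge = -1.590  # edge estimate reported for the Bernoulli model

# Under independence the edge parameter is the log-odds of any single tie.
p = 1.0 / (1.0 + exp(-theta_edge))

expected_edges = comb(n, 2) * p                    # one term per dyad
expected_two_stars = n * comb(n - 1, 2) * p**2     # centre node times pairs of neighbours
expected_three_stars = n * comb(n - 1, 3) * p**3   # centre node times triples of neighbours
expected_triangles = comb(n, 3) * p**3             # one term per node triple
```

The tie probability comes out at about 0.17, and the expected 2-star and triangle counts (about 132.6 and 7.48) agree closely with the simulated means of 132.5 and 7.481 in the table.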
Slide 45
Taro exchange: edge-triangle models
Model 2
effects         estimates   stderr
edge            -1.180      0.524 *
AT(2.00)         2.296      0.602 *
AET(2.00)       -1.147      0.385 *

Model 3
effects         estimates   stderr
edge             1.472      2.169
2-star          -0.369      0.401
triangle         4.618      1.511 *
edge-triangle   -0.588      0.283 *

Both models suggest:
- Triadic closure
- A negative association between participation in closed triads and degree
Slide 46
Comparison of Models 2 and 3
                      Model 2                    Model 3
effects     obs       mean    SD     t-ratio     mean    SD      t-ratio
2-star      109       125.0   27.4   -0.6        108.3   15.7     0.1
3-star      80        127.5   48.1   -1.0        75.3    19.9     0.2
Triangles   10        9.9     1.9     0.1        10.0    2.8     -0.0
SD_deg      0.96      1.5     0.2    -2.4        0.9     0.154    0.2
Skew_deg    1.25      0.59    0.5     1.3        0.042   0.428    2.8
GCC         0.27      0.24    0.1     0.6        0.286   0.098   -0.1
Mean LCC    0.34      0.39    0.10   -0.6        0.346   0.115   -0.1
Var LCC     0.04      0.11    0.03   -2.1        0.077   0.033   -1.0
Model 3 appears to be more closely centred on the data
Slide 47
Taro exchange simulated from Model 3
Slide 48
The edge-triangle model for Taro exchange
effect     estimates   stderr
edge        1.472      2.169
2-star     -0.369      0.401
triangle    4.618      1.511 *
ET         -0.588      0.283 *
A triadic closure effect, accompanied by a negative association between triadic closure and tie formation
Slide 49
Interaction network in a university karate club (n = 34)
Zachary W. (1977). An information flow model for conflict and
fission in small groups. Journal of Anthropological Research, 33,
452-473.
Slide 50
Zachary's karate club
effect       estimate   stderr
edge         -1.930     1.553
AS(2.00)     -0.523     0.459
AT(2.00)      0.624     0.191 *
A2P(2.00)     0.130     0.022 *

Goodness of fit is good except for:
effect      observed   mean    stddev   t-ratio
5-clique    2          0.080   0.325    5.905

Positive tendencies for closure in both 3- and 4-cycles
Slide 51
Kapferer's tailor shop in Zambia: sociational (friendship and socioemotional) ties, time 2* (n = 39)
*Kapferer B. (1972). Strategy and transaction in an African factory. Manchester: Manchester University Press.
Slide 52
Model 1
effects      estimates   stderr
edge         -5.010      1.567
AS(2.00)      0.182      0.478
AT(2.00)      1.395      0.286
Slide 53
Model 1: heuristic goodness of fit
effects            observed   mean        stddev      t-ratio
2-star             2904       2680.054    326.507      0.686
3-star             13752      10781.107   1786.582     1.663
Triangle           451        337.120     45.839       2.484
4-clique           448        139.802     35.018       8.801
5-clique           234        15.237      9.227       23.709
2-triangle         4617       2164.541    461.498      5.314
4-cycle            3880       2574.071    534.325      2.444
1-edge-triangle    18463      11816.924   2197.726     3.024
2-edge-triangle    133495     69665.573   16563.097    3.854
SD degrees         5.510      3.992       0.449        3.382
Skew degrees       0.380      -0.088      0.418        1.118
Global CC          0.466      0.377       0.013        6.674
Mean Local CC      0.498      0.409       0.021        4.188
Var Local CC       0.031      0.014       0.008        2.061
Slide 54
Model 2
Effect       Parameter   Std Err
edge          0.2238     2.07641
AS(2.00)     -0.8892     0.55314
AT(2.00)      1.2592     0.26601
A2P(2.00)    -0.1545     0.02705
Slide 55
Model 2: heuristic goodness of fit
Network statistic      observed   mean        stddev      t-ratio
2-star                 2904       2330.316    975.688      0.588
3-star                 13752      9240.431    5024.283     0.898
4-star                 51734      27751.457   17889.300    1.341
5-star                 161037     66417.664   48948.210    1.933
triangle               451        304.266     133.412      1.100
4-clique               448        139.815     86.093       3.580
5-clique               234        19.122      17.806      12.068
6-clique               56         0.699       1.475       37.484
7-clique               4          0.004       0.063       63.277
2-triangle             4617       2056.647    1241.844     2.062
3-path                 37493      27640.853   15235.981    0.647
4-cycle                3880       2369.184    1469.860     1.028
1-edge-triangle        18463      10648.487   6104.692     1.280
2-edge-triangle        133495     63121.032   42879.638    1.641
Std Dev degree dist    5.510      3.833       0.609        2.753
Skew degree dist       0.380      -0.183      0.427        1.317
Global CC              0.466      0.389       0.021        3.620
Mean Local CC          0.498      0.424       0.031        2.362
Var Local CC           0.031      0.024       0.017        0.385
Slide 56
Model 3
Effect       Parameter   Std Err
edge         -1.442      2.658
AS(2.00)     -0.516      0.707
AT(2.00)      0.997      0.294 *
A2P(2.00)    -0.066      0.049
AET(2.00)     0.022      0.007 *
Slide 57
Model 3: goodness of fit
Effects              observed   mean         stddev      t-ratio
2-star               2904       3652.456     674.099     -1.110
3-star               13752      17389.031    4533.265    -0.802
4-star               51734      62849.416    21196.612   -0.524
5-star               161037     182779.586   76695.796   -0.283
3-clique             451        519.753      121.758     -0.565
4-clique             448        366.373      162.594      0.502
5-clique             234        102.199      86.652       1.521
6-clique             56         12.452       22.889       1.903
7-clique             4          0.749        3.534        0.920
2-triangle           4617       4756.735     1806.390    -0.077
3-path               37493      51052.206    13613.872   -0.996
4-cycle              3880       4963.787     1679.661    -0.645
1-edge-triangle      18463      21944.593    6957.045    -0.500
2-edge-triangle      133495     156834.447   62461.843   -0.374
SD degree dist       5.510      4.416        0.445        2.458
Skew degree dist     0.380      -0.023       0.340        1.183
Global CC            0.466      0.423        0.025        1.691
Mean Local CC        0.498      0.448        0.026        1.910
Variance Local CC    0.031      0.010        0.007        3.096
Slide 58
An Australian government organisation (n=60): important
ties
Slide 59
Model for Australian government organisation
effects      estimates   stderr
edge         -4.414      0.414 *
AS(2.00)      0.164      0.228
AT(2.00)      0.588      0.165 *
A2P(2.00)     0.071      0.055
The model appears to fit well
A modest and non-significant tendency towards dispersed degrees, and a moderate closure effect
Slide 60
A dolphin community near Doubtful Sound, NZ* (n = 62) *D.
Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S.
M. Dawson, The bottlenose dolphin community of Doubtful Sound
features a large proportion of long-lasting associations,
Behavioral Ecology and Sociobiology 54, 396-405 (2003).
Slide 61
Model 1
Effect                Parameter   Std Err
edge                  -1.9446     0.90718
alt-star(2.00)        -0.5545     0.26904
alt-triangle(2.00)     0.9906     0.11496
Slide 62
Model 1: goodness of fit
Effect                          observed   mean     stddev   t-ratio
# 2-stars                       923        901.4    284.6     0.08
# 3-stars                       1861       2092.5   1059.1   -0.22
# 1-triangles                   95         77.0     26.2      0.69
# 2-triangles                   300        166.7    100.5     1.33
# 3-paths                       9325       9537.5   2662.3   -0.08
# 4-cycles                      278        196.4    121.1     0.67
# (1,1)-coathangers             1644       1443.3   767.6     0.26
# cliques of size 4             27         8.6      6.0       3.03
# alt-k-indpt.2-path(2.00)      705.4      737.3    195.8    -0.16
Std Dev degree dist             2.9        3.0      0.5      -0.13
Skew degree dist                0.29       0.86     0.37     -1.52
Global Clustering               0.31       0.26     0.02      2.69
Mean Local Clustering           0.26       0.30     0.04     -0.91
Variance Local Clustering       0.04       0.07     0.02     -1.52
Slide 63
Model 2
Effect             Parameter   Std Err
edge               -2.8230     0.92935
1-triangle          2.2123     0.26306
2-triangle         -0.0242     0.03311
1-edge-triangle    -0.0402     0.01129
alt-star(2.00)     -0.1337     0.28843
Slide 64
Model 2: goodness of fit
Effect                        observed   mean     stddev   t-ratio
# 2-stars                     923        925.7    159.7    -0.02
# 3-stars                     1861       1847.2   469.9     0.03
# 3-paths                     9325       9515.5   1379.7   -0.14
# 4-cycles                    278        268.7    100.3     0.09
# cliques of size 4           27         31.6     20.5     -0.23
# alt-triangle(2.00)          177.5      181.0    31.1     -0.11
# alt-indpt-2-path(2.00)      705.4      719.7    113.7    -0.13
Std Dev degree dist           2.93       2.82     0.25      0.44
Skew degree dist              0.29       0.26     0.23      0.13
Global Clustering             0.31       0.32     0.05     -0.18
Mean Local Clustering         0.26       0.29     0.04     -0.70
Variance Local Clustering     0.04       0.06     0.016    -1.29
Slide 65
In conclusion
- The dependence hierarchy systematically articulates possible proximity-based logics for conditional dependencies between network ties and yields a versatile modelling framework to reflect a variety of hypothesised tie formation processes
- The illustrative applications demonstrate the potential value of this flexible framework, and suggest evidence for various hypothesised processes
- There is, of course, much more to be done, e.g.:
  - evaluating model adequacy
  - comparing models
  - ensuring robust model specifications
  - ...