Algorithmic Complexity and Structural Models of Social Networks∗
Christopher Wheat
MIT Sloan School of Management
50 Memorial Drive, Cambridge, MA 02142-1347
April 16, 2007
Abstract
This article explores how the algorithmic complexity approach can be used to address the problem of identifying group structures in social networks. A specific implementation of the algorithmic complexity approach based on the principle of minimum description length (MDL) is compared to other model selection criteria, and compared and contrasted with a Bayesian approach to model selection. The method presented here provides a statistical basis for determining how many groups the actors in a given network should be partitioned into. Additionally, this article explores the analysis of two independent mechanisms by which group structures might be produced in social networks—those associated with explicit categories and those associated with preferential attention to particular local structures. I outline a method for using p1 stochastic blockmodels and exponential random graph models (ERGMs) in the context of the identified algorithmic complexity approach to address this question, and demonstrate the method in two empirical settings.
1 Introduction
Much of contemporary sociology is rooted in two simple premises about the nature of social structure. The
first is that some structures can be most usefully understood in terms of the social categories that actors are
assigned to. A related second premise is that actors meaningfully attend to the boundaries between these
categories (Lamont and Molnar 2002). For instance, the argument that career paths are best understood
not in terms of a series of isolated job-to-job transitions, but rather as elaborations of a set of ideal-typical career paths (Abbott and Hrycak 1990; Stovel et al. 1996; Han and Moen 1999) is one example of the
exploration of these premises. This work develops the basic idea that actors and the institutions that embed
them treat these career paths as meaningfully typed social objects, and that once an actor is in a given
career path, it is relatively unlikely that she will cross the boundary into another.

∗I would like to thank Peter Marsden, Tiziana Casciaro, David Gibson, Joel Podolny, Nitin Nohria and Kate Kellogg for their invaluable feedback on this and earlier versions of this work. I would also like to thank David Hunter for assistance with statnet and curved exponential model estimation. Some of the analysis was performed with statnet 1.0, developed with support from NIH grants R01DA012831 and R01HD041877.

In a different context,
Phillips and Zuckerman (2001) argue that when actors can be divided into high- and low-status categories,
high-status actors near the boundary between the two will behave in a distinctive status-preserving way. In
each of these cases, a core argument is presented based on the idea that the social behavior of an actor is
not determined strictly by her individual characteristics and motivations, but also by characteristics of and
structures around meaningful social categories.
The identification of boundaries within social systems is, as such, an issue of interest in a wide range
of sociological phenomena. That said, it presents a particularly significant opportunity for sociologists who
employ social network methods in their analyses (White et al. 1976; Laumann et al. 1978). Social network
constructs such as structural equivalence (Lorrain and White 1971) and regular equivalence (White and
Reitz 1983) provide a facility by which similarities and differences between actors can be quantitatively
measured. When used in combination with various clustering techniques, these constructs have been used to
analyze group structures within a wide range of social systems (White et al. 1976; Gerlach 1992; Alderson
and Beckfield 2004). These blockmodel analyses provide the basis of a formal method that can be used to
assess group structure in networks.
Blockmodel analyses such as these have widely been used to empirically assess group structure in net-
works. However, there is a significant shortcoming inherent in the method as it has typically been applied.
The logic of structural or regular equivalence is directly applicable to the determination of the similarity
or difference between a pair of actors, and as such represents a useful first step in the process of identify-
ing meaningful groupings. This information cannot, however, be used in isolation to identify meaningfully
bounded groups. A complete answer to this question requires a method that can be used to identify cases
in which two actors that are not precisely identical should be included within the same group, or likewise,
when two actors that are sufficiently distinct should be partitioned into different groups. Such a method is
qualitatively distinct from a method that can only measure the degree to which a particular pair of actors is
similar or different. Moreover, even to the extent that they provide some information about group structures
in social networks, standard blockmodel analyses provide little information about the processes by which
these structures are produced.
These issues can be illustrated by reviewing the analysis by White et al. (1976) of the pattern of relations
of positive affect in a monastery (Sampson 1968) as depicted in Figure 1. White and his coauthors suggest
that the monks in this study might meaningfully be divided into a number of groups, based on the pattern of
their expressed positive affect toward one another. To the extent that group memberships and the boundaries
between them play a role in shaping affect, the patterning of these relationships should reflect the underlying
Figure 1: Sampson's Monks—Affect Relations. [Network diagram; the monks are partitioned into three labeled groups: Young Turks, Loyal Opposition, and Outcasts.]
                   Young Turks   Loyal Opposition   Outcasts
Young Turks             1                0              0
Loyal Opposition        0                1              0
Outcasts                0                1              1

Figure 2: Image Matrix for Sampson's Monks
structure of this social system. In particular, White et al. were interested in identifying “zeroblocks”—the
relative absence of a particular type of relation between two groups in a given system. Accordingly, they
identify a three-group structural model that mirrors the original analysis by Sampson (1968). Figure 2
illustrates the derived schematic set of relationships and their absence. The results of this blockmodel
analysis suggest, for example, that the Outcasts express positive affect toward the Young Turks, but that
the Young Turks do not reciprocate this affection.
This analysis is useful in that it effectively captures the pattern of interaction between the actors in
this social system. It is generally true, for instance, that members of the Young Turk group are unlikely to
express positive affect toward members of the Outcast group, suggesting that the boundary between these
two groups is an important one. The three-group model illustrated here is not a perfect model, however,
in that there are certainly exceptions to the rules it implies. In some sense, a more accurate model might
be obtained by decomposing the three proposed groups into five groups—an alternative model White et al.
(1976: 751-2) also propose. In fact, given that there are no two actors in this system that are completely
equivalent, the logical extension of this argument would be that the most accurate model would be one in
which there are no groups at all and each actor is assigned to his own position.
It is also noteworthy that the only explanatory variables involved in this type of blockmodel analysis are
the group membership of the two actors involved in a given type of social exchange. Both the three-group
and five-group models proposed by White et al. implicitly make the claim that the observed pattern of
exchanges is explained simply by the group memberships of each actor. While this may be the case, there
are other mechanisms besides explicit group membership and attentiveness to social boundaries, such as
tendencies toward reciprocity and transitive closure, that might produce clustered social networks. Inasmuch
as blockmodel analyses of this sort cannot account for these mechanisms and processes, they are further
constrained in their ability to make claims about the presence or absence of explicit social boundaries.
The fundamental problem illustrated by this example is that the logic of equivalence, while being quite
useful for determining the extent to which two actors are similar or different, is by itself insufficient for
determining which differences or similarities are substantively meaningful. There are a set of goodness-of-fit
measures (Carrington et al. 1979; Carrington and Heil 1981; Wasserman and Faust 1994) that can be used,
along with these equivalence measures, to narrow down the total number of group-based models of social
structure that might be applied to a network of interactions. Selecting between models that have varying
numbers of explanatory groups, particularly to the extent that other social processes and mechanisms are
accounted for, requires a different approach.
One approach to identifying the presence or absence of explicit groups and boundaries in a social network
is to cast the problem in terms of model selection. To the extent that group memberships are used to predict
whether or not a given pair of actors will socially interact, models with varying numbers of subgroups
can be viewed as predictive models with varying numbers of explanatory variables. Similarly, models that
consider other cluster-producing mechanisms such as reciprocity and transitivity will include additional
predictive variables. In general, the problem of selecting between models differentiated by their use of
explanatory variables has recently attracted the interest of researchers who study social systems (Burnham
and Anderson 2004; Kuha 2004; Stine 2004). In particular, the Akaike information criterion (AIC) (Akaike
1974) and the Bayesian information criterion (BIC) (Raftery 1995) have been proposed as metrics that can be
used to establish the appropriate number of explanatory variables in the context of linear regression models.
In a similar vein, Stine (2004) proposes that the Minimum Description Length (MDL) criterion (Rissanen
1983, 1989) is a particularly useful metric for solving this problem, because it offers a more explicit mechanism
for taking account of the total set of models considered.
In this article, I argue that the MDL approach in particular and the algorithmic complexity approach in
general are not only useful for model selection in the regression context, as argued by Stine (2004), but are also
particularly useful for model selection in the context of structural models of social networks. Specifically, I
argue that the boundary identification problem can be solved in this context by relying upon the explicit and
theoretically grounded trade-off that the algorithmic complexity approach makes between model complexity
and model accuracy. In the following section I present a general discussion of the algorithmic complexity
framework and how it addresses issues presented by the problem of model selection. Section 3 presents a
specific application of the algorithmic complexity framework to model selection in the context of stochastic
statistical models of network structure. In Section 4 I demonstrate the use of the general approach in
the analysis of group structure in two empirical examples.
2 Algorithmic Complexity and Model Selection
The example of Sampson’s monastery demonstrates that the task of identifying the boundaries in a so-
cial system is equivalent to determining the salient group memberships in a population that govern social
interaction. Any population of actors can potentially be divided into subgroups in a wide range of ways—
the three-group and five-group models proposed by White et al. (1976) are only two of many possibilities.
Each of these partitions of actors into groups can be conceptualized as a model of behavior that takes a
corresponding set of boundaries into account.
The consideration of the extent to which a given model of social exchange is an appropriate representation
of a set of observed behavior can be decomposed into three conceptually distinct questions. The first question
concerns the accuracy of the model—the correspondence between the behavior that it describes and the
behavior that is actually observed. The five-group model of affect in the Sampson monastery is, in this
sense, more accurate than the three-group model. A second question concerns the specificity of the model—
the extent to which a particular model is likely to be constrained to a particular set of observed possibilities.
Models which are less specific, or more generalizable, are typically thought of as being more useful models in
some general sense. So while the five-group model of affect is more accurate than is the three-group model,
it is also more specific. Inasmuch as these first two questions reflect competing concerns, a third question
concerns the issue of how to balance the model accuracy with model specificity.
In the context of the analysis of blockmodels of social networks in particular, there is a substantial body
of research that has focused primarily upon the first of these questions. Much of the early research in
this area (Carrington et al. 1979; Carrington and Heil 1981; Panning 1982; see also Wasserman and Faust
1994) focused on deterministic blockmodels—models in which empirically observed exchanges are essentially
conceptualized as equivalent to social structure rather than a realization thereof. Subsequent developments of
stochastic rather than deterministic conceptualizations of network structure models (Holland and Leinhardt
1981; Wasserman and Pattison 1996; Anderson et al. 1999; van Duijn et al. 2004) led to a general consensus
that the accuracy of a particular network model should be measured in terms of its ability to predict an
observed pattern of ties with high probability. A number of studies have applied this general logic specifically
to the problem of studying group structure using network data (Fienberg and Wasserman 1981; Holland et al.
1983; Wang and Wong 1987; Anderson et al. 1992; Snijders and Nowicki 1997; Nowicki and Snijders 2001).
While there has been considerable development in refining approaches to solving the problem associated
with the first question, there is somewhat less agreement about the second and the third. In the context
of stochastic blockmodeling, a G2 likelihood-ratio statistic has been proposed as a useful way to compare
multiple network models, specifically in the case where one candidate model is nested in another (Anderson
et al. 1992). Other approaches have been suggested as well (Snijders and Nowicki 1997; Nowicki and Snijders
2001), but few approaches have been proposed that offer a theoretically grounded rationale for combining a
measure of model accuracy with an assessment of the properties of the model itself.
In this section, I argue that model selection approaches based on the logic of algorithmic complexity
(Solomonoff 1964; Kolmogorov 1965; Chaitin 1966) dominate all other approaches. An approach based on
algorithmic complexity has the advantage of addressing all three of these questions in a consistent logic
grounded in probability theory and leads to a selection logic that can be clearly explained and explicitly
compared to other model selection approaches. Accordingly, after outlining the basic features of the approach,
I demonstrate how other approaches that have been recently proposed can be analyzed using the same general
framework, and discuss some of the implications of using these other selection criteria.
2.1 Algorithmic Complexity
A seminal stream of ideas in the field of information theory that has some bearing on the problem of identi-
fying structure in social exchange behavior concerns the proposition that any set of data has a measurable
amount of randomness, or conversely, non-random structure. The unique contribution of these ideas to the
problem of balancing the accuracy of a model with its specificity is the way in which this theory provides
a framework in which both of these concepts are measured in the same terms. There are several model
selection measures that are based in this tradition, and each differs from the other in subtle but important
ways. While these differences can in certain cases be meaningful, they are less significant than the way
in which these measures differ from other measures that have been commonly applied to model selection
problems in the social sciences.
The central idea in this stream of research is that abstract data can be characterized in terms of the
amount of information that it communicates—which is to say, the extent to which it is a representation of a structured phenomenon rather than random noise (Shannon 1948). To the extent that a set of observations
is in fact structured rather than completely random, the structure embedded in the data should be useful in
finding an efficient non-redundant representation of that data (Chaitin 1966; Kolmogorov 1965). Algorithmic
complexity is a measure of the size of this most-efficient representation, typically referring to the length of
the computer code that expresses the most efficient algorithm that would reproduce the original set of data.
Two extensions of this idea that are particularly relevant to the task of model selection in the social
science context are the subsequent independent developments of the Minimum Message Length (Wallace and
Boulton 1968) and the Minimum Description Length (Rissanen 1983, 1989) measures. While the Minimum
Message Length (MML) and Minimum Description Length (MDL) measures vary both in their theoretical
underpinnings and in some of their practical implications (Wallace and Dowe 1999), they have in common a
reliance on the decomposition of the algorithmic complexity measure into two parts, one of which refers to
the accuracy of a model, and another which refers to model specificity. Rather than requiring researchers to
focus on finding the shortest arbitrary algorithm that could reproduce a particular set of observations, this
decomposition allows the problem of model selection to be posed in a way that is compatible with existing
social science approaches to modeling structure.
Both the MDL and the MML approach measure algorithmic complexity in terms of the number of binary
digits (bits) it would theoretically take to specify both the features of a given structural model θ and a given
observation x conditional on the indicated model representing a true hypothesis, such that
L(θ, x) = L(θ) + L(x|θ), (1)
where L(θ, x) is the total complexity, L(θ) is a measure of the complexity of the model, and L(x|θ) is the
complexity of the data conditioned on the model. A key result due to Shannon is that L(x|θ) is equal to
−lg(p(x|θ)), such that the total complexity
L(θ, x) = L(θ)− lg(p(x|θ)). (2)
The similarity of this expression to Bayes’ theorem leads to a particularly useful result for model selection
in the context of stochastic models of social structure. Bayes' theorem states that
p(θ|x)p(x) = p(θ)p(x|θ). (3)
For a given observation of x, p(x) is either undefined or fixed. If the model selection task in general is
characterized as finding the most likely model of structure given observed data x, then maximizing the
right hand side of Equation 3 is equivalent to maximizing p(θ|x) and explicitly solving the model selection
problem. Similarly, if we define L(θ) as −lg(p(θ)), then Equation 2 is just a binary logarithmic expression
of Equation 3.
It is, of course, non-trivial to assume that L(θ) can be defined this way, or even that prior probabilities p(θ)
can be assigned to abstract structural models—the theories underlying MML and MDL diverge precisely on
this point. Nevertheless, the logic of this decomposition poses a compelling question to the broader problem
of model selection. The intuitive logic of Occam’s Razor is captured in an expression with explicit grounding
in probability theory. Consider a dyad in which the presence or absence of exchange can be explained by
two candidate models of the organization in which the dyad is embedded. The model θ1 is characterized by
a single dichotomous parameter x1 ∈ {non-profit, for-profit}. A second model θ2 is characterized by both
x1 and a second dichotomous parameter x2 ∈ {large, small}. The two possible models consistent with the
one-parameter model θ1 can each be described with a single bit, and in the absence of any other information
about the organization in which the dyad is embedded, each of these models might be assigned a prior
probability of 1/2. Similarly, the four possible models consistent with the two-parameter model θ2 can each
be described with two bits and might each be assigned an equal probability of 1/4.
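The bookkeeping in this example can be sketched in a few lines of code. The fit probabilities used below are hypothetical illustrations of my own, not estimates from the source; the function names are likewise assumptions.

```python
import math

def model_bits(k):
    """Description length of a model with k dichotomous parameters:
    one of 2**k equally likely candidates costs lg(2**k) = k bits."""
    return k

def model_prior(k):
    """Implied prior probability of any one such model."""
    return 2.0 ** (-k)

def total_length(k, p_fit):
    """Two-part description length L(theta, x) = L(theta) - lg p(x|theta)."""
    return model_bits(k) - math.log2(p_fit)

# theta_1 ({non-profit, for-profit}): 1 bit, prior 1/2.
# theta_2 (adds {large, small}):      2 bits, prior 1/4.
assert model_prior(1) == 0.5 and model_prior(2) == 0.25

# A richer model must buy enough extra fit to pay for its extra model
# bit; with these hypothetical fit probabilities, theta_1 wins.
assert total_length(1, 0.25) < total_length(2, 0.20)
```

Here a one-parameter model that assigns the data probability 1/4 costs 1 − lg(1/4) = 3 bits in total, while the two-parameter alternative would need to fit strictly better than probability 1/2 of that to be preferred.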
In general, the idea that models characterized by a larger set of dichotomous parameters should be less likely than those characterized by a smaller set can be expanded to cover models with more expressive parameters. Rissanen (1983) proposes a prior for unbounded integral parameters, in which the code length (and hence the implied prior probability) for a natural number z is given by

L0(z) = lg∗(z) + lg c. (4)

In this expression, lg∗(z) = lg z + lg lg z + · · · , where the sum includes only the positive terms of the sequence, and the constant c = ∑_n 2^−lg∗(n) ≈ 2.865064. Integers (that is, positive and negative whole numbers) can be encoded by adding a single bit to indicate the sign of the number.
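Equation 4 can be evaluated directly; the sketch below uses base-2 logarithms throughout and takes the value of c from the text. The function names are my own.

```python
import math

def lg_star(z):
    """lg*(z) = lg z + lg lg z + ..., keeping only the positive terms."""
    total, v = 0.0, math.log2(z)
    while v > 0:
        total += v
        v = math.log2(v)
    return total

C = 2.865064  # normalizing constant from the text, sum over n of 2**(-lg*(n))

def integer_code_length(z):
    """L0(z): bits needed to encode the natural number z (Equation 4)."""
    return lg_star(z) + math.log2(C)

# Small integers are cheap, larger ones progressively dearer:
assert lg_star(2) == 1.0    # lg 2 = 1; the next term lg lg 2 = 0 is dropped
assert lg_star(16) == 7.0   # 4 + 2 + 1
```

The iterated-log prior thus spends roughly lg z bits on z itself plus a small surcharge for encoding how long that first code is, which is what makes it usable for unbounded parameters.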
Real-valued parameters pose a potential problem to using a probabilistic approach for model selection.
The uncountability of real numbers or even interval-subsets of R means that assigning any positive probability
to every possible real-value of a model parameter would violate the law of total probability. As such, in order
to assess the probability of a real-valued parameter, it must be mapped to some countable subset. Rissanen (1989) approaches this problem by arguing that for n observations of data, there is an optimal precision d, such that the parameter should be limited to taking on 2^d values. This precision can be determined as

d(n) = (lg n)/2 + c_d, (5)

where c_d is a negligible constant related to the curvature of the description length function (Rissanen 1989: 56).
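Equation 5 yields concrete precisions once the negligible constant is dropped. The use of Sampson's eighteen-monk network as a worked example here is my own illustration, not the source's.

```python
import math

def optimal_precision(n):
    """d(n) ~ (lg n)/2: bits of precision warranted by n observations
    (Equation 5, with the negligible constant c_d dropped)."""
    return math.log2(n) / 2.0

# A parameter fit to the g(g-1) = 306 directed ties among 18 actors
# warrants only about 4.1 bits of precision; precision grows slowly,
# by half a bit per doubling of the data.
precision = optimal_precision(18 * 17)
```

This slow growth is what keeps the parameter-cost term of a description length from overwhelming the fit term as networks get large.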
Inasmuch as most stochastic blockmodels can be represented with either integral or real-valued independent parameters, these results can be straightforwardly applied to the task of blockmodel selection, and
accordingly, to the task of using stochastic blockmodels to identify boundaries in populations. These results,
and Equations 2 and 3 in particular can moreover be used as a theoretically grounded framework against
which other model selection criteria can be evaluated.
2.2 Alternative Model Selection Criteria
The MDL and MML approaches are not unique in their separate evaluation of model accuracy and model
specificity, nor are they unique in employing a stochastic assessment p(x|θ) in measuring model accuracy.
Consider the AIC (Akaike 1974) and BIC (Raftery 1995) mentioned above. Each of these criteria can (up to a scaling factor) be represented in such a way as to measure the fit of a model to observed data as p(x|θ), as follows:
AIC(x, θ) = −|θ| + lg p(x|θ), (6)

BIC(x, θ) = −(|θ|/2) lg n + lg p(x|θ), (7)

where |θ| is a count of the number of parameters in the model θ and n is the number of observations. Each of these expressions also corresponds to a logarithmic representation of Equation 3. Using Bayes' formula and Equation 2 to draw implications about how each of these expressions would assign a prior probability p(θ) to the observation of a given model yields

pAIC(θ) = 2^−|θ|, (8)

pBIC(θ) = (√n)^−|θ| = n^−|θ|/2. (9)
Like the algorithmic complexity approaches outlined above, both of these measures evaluate models with
increasing numbers of parameters as having lower prior probabilities. The principal distinction between
Equation 8 and Equation 9 concerns their respective sensitivity to the number of observations n being
modeled, and the implications of this sensitivity for the assignment of probability to possible models.
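Equations 8 and 9 can be compared numerically. The sketch below simply evaluates the two implied priors, with parameter counts and sample sizes chosen for illustration only.

```python
def prior_aic(k):
    """Implied AIC prior (Equation 8): one bit per parameter."""
    return 2.0 ** (-k)

def prior_bic(k, n):
    """Implied BIC prior (Equation 9): (lg n)/2 bits per parameter."""
    return float(n) ** (-k / 2.0)

# The two coincide exactly when n = 4 (since sqrt(4) = 2) ...
assert abs(prior_aic(3) - prior_bic(3, 4)) < 1e-12

# ... but for larger samples the BIC prior penalizes each additional
# parameter more heavily, while the AIC prior is fixed at one bit.
assert prior_bic(3, 306) < prior_aic(3)
```

The divergence grows with n, which is why the two criteria can disagree sharply about the right number of groups in even moderately sized networks.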
Recall the example of Sampson’s monastery discussed above. The blockmodel illustrated in Figure 2
describes one of many possible models of the pattern of affect relations between the monks. A model of
these relations based on three positions would at the very least have nine parameters, corresponding to the
nine cells in the table. In the case where these are dichotomous parameters, there are 2^9 possible models
covering relations between these three positions. Following the logic outlined in the prior section, this has
direct implications for the precision used in representing model parameters. According to Equation 8, every
parameter in a model should only be represented by a single bit. Put differently, in order for the penalty imposed
by the AIC to correspond to a prior probability assessment on θ, the independent parameters that compose
θ must be dichotomous.
The first term of Equation 7 has a similar set of implications. As noted by Stine (2004: 243), for large n,
the penalty imposed by the BIC is similar to that of an algorithmic complexity model probability assessment.
However, the BIC penalty forces every parameter to be evaluated as a truncated real-valued parameter, an assumption that may not be valid in every case. Moreover, as Stine (2004) goes through considerable
trouble to demonstrate, the BIC is not flexible enough to manage the model selection task in which only
limited subsets of possible models are considered by a researcher. In general, straightforward application of
the BIC places limits on the assignment of model priors that may not in every case be appropriate.
Wasserman and Faust (1994) propose a third alternative to model selection specifically in the context
of stochastic blockmodeling. This alternative is only applicable to stochastic blockmodels that define a
predicted value x̂ij for every observed tie value xij. Wasserman and Faust propose that the likelihood-ratio statistic G² should be used to assess the goodness-of-fit of a stochastic blockmodel θ,

G²θ = 2 ∑_{i,j} xij log(xij / x̂ij). (10)
Besides the fact that it is unclear how to directly combine this metric with a measure of model accuracy such
as p(x|θ), it is not clear that there is a sound statistical basis to use this metric for many model selection
problems. In general, G2 metrics are used to compare nested models, and there are many model selection
problems, particularly in the context of attempting to locate boundaries in social systems, where there will
be candidate models that are not nested submodels. Moreover, Wasserman and Faust (1994: 703) note that
“this theory should be applied only to a priori stochastic blockmodels, because the ‘data mucking’ that must
be done to fit their a posteriori counterparts invalidates the use of a statistical theory”. The algorithmic
complexity approach outlined here provides a much clearer interpretation of what the model selection task
is, and is as such better suited to the general task of identifying boundaries in social exchange data.
3 Algorithmic Complexity and Stochastic Blockmodeling
One of the particularly useful features of the algorithmic complexity approach to model selection is that it
can be used to choose between models that are not nested within each other, or indeed not related by any specific functional relationship at all. This has a particular bearing on the problem of identifying groups and
boundaries in social networks due to the wide range of features that can produce exchange behavior that
results in these kinds of structures. Each of these network features or mechanisms typically corresponds to
a parameter or set of parameters that must be included in a given statistical network model. In order to
use the algorithmic complexity approach to compare these models, and in particular to determine L(θ) in
Equation 2, these features must be related to a specific set of model parameters, each with a specified level
of parameter precision.
To that end, this section proceeds as follows. I begin by outlining a baseline set of statistical features
that forms the core of most statistical models of network exchange. While these features do not per se relate
to group and/or boundary structures in networks, they do correspond to well-known features of exchange
patterns in general. The failure to include these features in one form or another in a statistical network
model can distort the estimation of the sought-after group effects. I then discuss a set of network features
and corresponding stochastic blockmodels in which the group memberships of actors directly and explicitly affect the pattern of network exchange. Finally, I discuss a set of local network processes that can produce
group and boundary structures without explicit reference to nominal groups.
3.1 Reciprocity and Individual Variation
Two of the earliest contributions to the statistical study of social networks were the observations that actors
in real social networks are likely to reciprocate the exchange behavior of their interaction partners, and
that these actors are generally differentiated in the extent to which they participate in exchange. The
p1 stochastic network model (Holland and Leinhardt 1981) is an extension of a Bernoulli random graph
that seeks to capture precisely these features of social networks. Given the interest in explicitly modeling
reciprocity as a network feature, the outcome variables in the p1 model are dyads Dij rather than ties xij .
To this end, Holland and Leinhardt (1981) begin with the MAN (mutual, asymmetric, and null) distribution
for dyads Dij which states that
mij = p(Dij = (1, 1)), (11a)
aij = p(Dij = (0, 1)), (11b)
aji = p(Dij = (1, 0)), (11c)
and
nij = p(Dij = (0, 0)), (11d)
such that
mij + aij + aji + nij = 1. (12)
The p1 distribution expands upon the reciprocity modeled by the MAN distribution by allowing het-
erogeneity in the exchange behavior of individual actors. Specifically, actors in a network are characterized
in terms of αi, which measures productivity, or the tendency of an actor i to generate ties, and βj , which
measures attractiveness, or the tendency of an actor j to receive ties. Based on these parameters, Holland
and Leinhardt propose the following exponential model for a directed graph:
p(x|θp1) = K exp{ ∑_{i<j} (ρij xij xji + λij xij + λji xji) }, (13)

where

ρij = log(mij nij / aij aji), (14a)

λij = log(aij / nij), (14b)

K = ∏_{i<j} (1/kij), (14c)

and

kij = 1 + e^λij + e^λji + e^(ρij + λij + λji). (14d)
The parameter λ is decomposed as
λij = λ + αi + βj for all i ≠ j, (15)
such that
α+ = β+ = 0. (16)
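The dyad distribution implied by Equations 13–14d can be checked directly: given ρ and the two λ values for a dyad, the four MAN probabilities follow from inverting Equations 14a–14d. The function name and parameter values below are my own illustrations.

```python
import math

def dyad_probs(rho, lam_ij, lam_ji):
    """MAN probabilities for a single dyad under the p1 model."""
    # k_ij normalizes the four possible dyad outcomes (Equation 14d).
    k = 1.0 + math.exp(lam_ij) + math.exp(lam_ji) \
            + math.exp(rho + lam_ij + lam_ji)
    null = 1.0 / k                                  # n_ij: no ties
    a_ij = math.exp(lam_ij) / k                     # asymmetric, i -> j only
    a_ji = math.exp(lam_ji) / k                     # asymmetric, j -> i only
    mutual = math.exp(rho + lam_ij + lam_ji) / k    # m_ij: reciprocated
    return mutual, a_ij, a_ji, null

# The four outcomes exhaust the dyad's possibilities (Equation 12).
m, a1, a2, n0 = dyad_probs(rho=1.5, lam_ij=-1.0, lam_ji=-0.5)
assert abs(m + a1 + a2 + n0 - 1.0) < 1e-12
```

Note that the odds of a mutual dyad relative to a null dyad are e^(ρ + λij + λji), so a positive ρ inflates reciprocated ties beyond what the two tie propensities alone would produce.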
The algorithmic complexity of a baseline p1 model can serve as a null hypothesis against which other
models can be compared. This model is characterized by a set of g − 1 productivity and g − 1 attractiveness parameters and a single reciprocity parameter, each of which is determined by the entire set of g(g − 1) network ties, where g is the total number of actors in the network. Accordingly, the algorithmic complexity for a p1 model can be written as

L(θp1, x) = − lg p(x|θp1) + (2(g − 1) + 1) d(g(g − 1)). (17)
While models that incorporate features that explicitly model group or boundary-sensitive exchange may
be more accurate predictors of the observed behavior in a given network, they will also, in general, be more
specific than this model. As such, in networks in which there is no statistical evidence for group structure, such models should have a higher total complexity than that given by Equation 17.
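Combining Equation 17 with the precision rule of Equation 5 gives a concrete parameter cost. The sketch below uses g = 18 and a hypothetical fit term, both illustrative assumptions of mine rather than results from the source.

```python
import math

def precision_bits(n):
    """d(n) from Equation 5, with the negligible constant dropped."""
    return math.log2(n) / 2.0

def p1_complexity(neg_lg_likelihood, g):
    """L(theta_p1, x): data cost plus the cost of 2(g-1) actor
    parameters and one reciprocity parameter (Equation 17)."""
    param_cost = (2 * (g - 1) + 1) * precision_bits(g * (g - 1))
    return neg_lg_likelihood + param_cost

# For g = 18 the parameter cost alone is 35 * lg(306)/2, about 144.5
# bits. A blockmodel beats this baseline only if its improvement in
# fit exceeds its own additional parameter cost.
baseline = p1_complexity(neg_lg_likelihood=200.0, g=18)  # fit term hypothetical
```

This is the sense in which the baseline p1 complexity acts as a null hypothesis: any group-structured competitor starts the comparison owing its extra description-length bits.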
3.2 Explicit Categories
One set of mechanisms that can produce exchange behavior that results in the production of group structure
would be comprised of those mechanisms in which explicit attention to group boundaries is implicated. Such
mechanisms either attribute different kinds of behavior to actors in different categories, or make predictions
about behavior between two actors on the basis of their respective group memberships. Stochastic blockmod-
els based upon these mechanisms can either predict group-oriented individual or dyadic behavior separately,
or can analyze the combined effect of both of these mechanisms simultaneously. Each of these approaches
has different implications for the determination of the algorithmic complexity of a stochastic blockmodel of
network exchange.
A straightforward way to model individual behavior as a function of group structure is to assume that
individual structural attributes are directly determined by the group or category that an actor belongs to.
Anderson et al. (1992) propose a basic extension to the p1 model that does this through placing restrictions
on the productivity and attractiveness parameters αi and βj . Extending the concept of structural equivalence
(Lorrain and White 1971), they define two actors to be stochastically equivalent if they have the
same probability of sending or receiving ties. The restrictions associated with this version of a stochastic
blockmodel, termed here a role-dependent stochastic blockmodel (RDB), can formally be stated as
φ(i) = φ(i′) = r ⇒ αi = αi′ = αr and βi = βi′ = βr,   (18)
where φ(·) is a partition mapping such that φ(i) = r implies that an actor i is a member of a group r.
These restrictions lead to the following probability distribution for a social network x given the role-
dependent stochastic blockmodel θRDB :
p(x|θRDB) = K exp{ ρm + λx++ + Σ_i αr xi+ + Σ_j βs x+j },   (19)
where r = φ(i), s = φ(j), and m is the total number of mutual ties in the network. The reciprocity parameter
ρ is equivalent to ρij as defined in Equation 14a, restricted to be the same for all dyads. Role-dependent
blockmodels are characterized by the two parameters ρ and λ, and two parameters for each block in a
blockmodel. Each of these parameters is effectively determined from the entire population of ties, so they
should all be optimally represented with d(g(g−1)) bits. Thus, the algorithmic complexity should be written
as
L(θRDB , x) = − lg p(x|θRDB) + (2(b− 1) + 2)d(g(g − 1)). (20)
An alternative to modeling the influence of group structures on the behavior of individual actors is
to model the influence of these structures on dyadic exchange. The pair-dependent stochastic blockmodel
(PSB)—one of the earliest stochastic blockmodels proposed—follows this logic quite explicitly (Holland
et al. 1983). A PSB is a model of a network that focuses on the network dyad vectors Dij = (Xij , Xji),
where Xij is a random variable representing the tie strength between an actor i and an actor j in a network
x. A probability distribution p(x) satisfies a pair-dependent stochastic blockmodel with respect to φ if and
only if:
1. the random vectors Dij are statistically independent, and
2. for any nodes i ≠ j and i′ ≠ j′, if φ(i) = φ(i′) and φ(j) = φ(j′), then the random vectors Dij and Di′j′
are identically distributed.
In other words, ties between actors in a block r = φ(i) and actors in a block s = φ(j) should be drawn
from the same distribution, or the probability of a particular dyad configuration existing between two actors
should depend only on the block assignments of the two actors.
This is a general definition which specifies how the probability distributions from which ties are drawn
should relate to block structure, but says nothing of the particular distributions for a particular block-
pair r × s. Holland et al. define the stochastic blockmodel with reciprocity (SBR) as a special case of the
PSB, where the distribution within a pair block is related to the p1 model (Holland and Leinhardt 1981).
Expanding on Equations 11, they define the parameters of the MAN distribution at the block pair level as
mrs = p(Dij = (1, 1)|φ(i) = r, φ(j) = s), (21a)
ars = p(Dij = (0, 1)|φ(i) = r, φ(j) = s), (21b)
asr = p(Dij = (1, 0)|φ(i) = r, φ(j) = s), (21c)
and
nrs = p(Dij = (0, 0)|φ(i) = r, φ(j) = s), (21d)
such that
mrs + ars + asr + nrs = 1. (22)
These quantities can be used to determine the probability of any given dyad Dij in a network conditioned
on a blockmodel θSBR:
p(x|θSBR) = Π_{i<j} mrs^{xij xji} ars^{xij(1−xji)} asr^{(1−xij)xji} nrs^{(1−xij)(1−xji)},   (23)
where r = φ(i) and s = φ(j). The authors re-express these block-pair parameters in terms of three other
parameters, λrs, λsr and ρrs, defined as
λrs = log(ars / nrs),   (24a)
λsr = log(asr / nsr),   (24b)
ρrs = log(mrs nrs / (ars asr)).   (24c)
The parameter ρrs is symmetric with respect to a block-pair r × s, and measures the tendency for ties
to be reciprocated within a block-pair. The parameters λrs and λsr measure the tendency for ties to be
asymmetrically sent from block r to block s and from block s to block r, respectively. Holland et al. impose
the restriction ρrs = ρ such that reciprocity is constant within a particular network.
The algorithmic complexity of a stochastic blockmodel with reciprocity θSBR can be determined as
follows. For every block pair r×s there is a parameter λrs. The parameter λrs is derived from the set of ties
from block r to block s, and there are potentially grs such ties for each block-pair. Following Equation 5,
the optimal precision for each of these parameters should be d(grs), where
grs = gr gs if r ≠ s, and grs = gr(gr − 1) if r = s,   (25)
and gr is defined as the total number of actors assigned to a position r. The model complexity must therefore
include the sum of the complexities for each of these parameters, Σ_{r,s} d(grs). Additionally, there is a single
parameter ρ which is determined from the set of all possible ties. There are (g(g − 1)) such ties, and thus
the model complexity should further include the complexity of this single parameter d(g(g − 1)).
L(θSBR) = d(g(g − 1)) + Σ_{r,s} d(grs).   (26)
Given that the complexity for the data in terms of the model L(x|θ) is always equal to − lg p(x|θ), applying
Equation 2 yields
L(θSBR, x) = − lg p(x|θSBR) + d(g(g − 1)) + Σ_{r,s} d(grs).   (27)
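The model-complexity terms of Equations 25–27 depend only on the block sizes, as the following illustrative sketch shows (the two-block partition is hypothetical, and d(n) = lg(n)/2 is assumed from Equation 5).

```python
import math

def d(n):
    return math.log2(n) / 2  # precision rule assumed from Equation 5

def g_rs(sizes, r, s):
    # Number of possible ties from block r to block s (Equation 25).
    return sizes[r] * sizes[s] if r != s else sizes[r] * (sizes[r] - 1)

def sbr_model_complexity(sizes):
    """Model complexity L(theta_SBR) of Equation 26: one lambda parameter
    per ordered block pair at precision d(g_rs), plus a single reciprocity
    parameter rho at precision d(g(g - 1))."""
    g = sum(sizes.values())
    total = d(g * (g - 1))
    for r in sizes:
        for s in sizes:
            total += d(g_rs(sizes, r, s))
    return total

# Hypothetical partition of 27 actors into blocks of 10 and 17.
L_theta = sbr_model_complexity({0: 10, 1: 17})
```

Adding − lg p(x|θSBR) from a fitted model then gives the total description length of Equation 27.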
While the SBR does model dyadic group-level interaction, it does not allow for the heterogeneity among
actors that the p1 network model accounts for. Wang and Wong (1987) propose a p1 stochastic blockmodel
(P1B) that extends the basic p1 model to incorporate the possibility of within and between group effects on
exchange behavior. The p1 stochastic blockmodel is similar to the role-dependent model inasmuch as both
use block structure to decompose the asymmetric tie parameter λij . In this approach to blockmodeling, a
parameter λrs is introduced as an “interaction term”, which allows block structure to affect the tendencies
for ties to exist between blocks. This parameter is defined through the decomposition of λij , such that
λij = λ + αi + βj + λrs, (28)
where r = φ(i) and s = φ(j), and λ is taken as a constant across the network. In addition, λrs is subject
to the side constraints λr+ = 0 and λ+s = 0, which are similar to the constraints typically placed on αi and
βj . These constraints lead to the following probability distribution for a p1 blockmodel θP1B :
p(x|θP1B) = K exp{ ρm + λx++ + Σ_{r,s} λrs x++(rs) + Σ_i αi xi+ + Σ_j βj x+j },   (29)
where x++(rs) represents the total number of ties in the r×s block. Wang and Wong allow for the possibility
that the λrs parameters could be further restricted, such that some λrs values are forced to be equal. For
instance, they consider the case where only two values of λrs are realized in a model, λrs = λd for diagonal
blocks, and λrs = λo for off-diagonal blocks. This model represents a hypothesis that the likelihood of a
tie between actors in the same position is different than the likelihood of a tie between actors in different
positions. In general, they allow for λrs to take on at most k values, where k ≤ (b − 1)² because of the side
conditions on λrs.
The P1B has k free block parameters, in addition to the 2(n − 1) parameters for individual actor at-
tractiveness and expansiveness, as well as a λ and ρ for the entire network. Each of these parameters
is effectively determined from the population of all ties and should thus be represented with a precision
d(g(g − 1)). Accordingly, the complexity for a p1 blockmodel is
L(θP1B , x) = − lg p(x|θP1B) + (2(n− 1) + k + 2)d(g(g − 1)). (30)
One stochastic blockmodeling approach that incorporates group effects both on individual and dyadic
behavior is the p2 model (van Duijn et al. 2004). The proposed model is consistent with the p1 model in
that the behavior of actors is captured in the parameters αi and βj , but the p2 model also allows dyadic tie
density λij and reciprocity ρij to vary as well. Actor attributes αi and βj are effectively modeled under p1
as fixed-effects with no other co-variates. Under p2, these actor attributes are modeled as random effects
with a set of covariates Y1 and Y2 such that
αi = Y1iγ1 + Ai (31a)
βj = Y2jγ2 + Bj (31b)
where γ1 and γ2 are regression weights, and the residuals Ai and Bj are randomly distributed with mean 0
and variances σ²_A and σ²_B respectively. Similarly, dyadic density and reciprocity are modeled as
λij = λ + Z1ijδ1 (32a)
ρij = ρ + Z2ijδ2 (32b)
where δ1 and δ2 are regression weights, and the reciprocity matrix Z2 is constrained to be symmetric such
that Z2ij = Z2ji.
This original specification of the p2 model, to the extent that it only allows a single set of actor attribute
vectors Y1 and Y2 and a single set of dyadic attribute matrices Z1 and Z2, can only be used to model
boundary-sensitive behavior in limited ways. However, subsequent representations of the p2 model (Zijlstra
and Duijn 2003) relax this restriction, allowing individual and dyadic group effects to be modeled directly
with multiple dummy variables corresponding to multiple attribute vectors and matrices. A p2 stochastic
blockmodel that has no co-variates other than those indicating group membership then has a parameter for
each group-level parameter αr and βs, as well as a parameter for each group-level dyadic parameter λrs and
ρrs. If these parameters are subject to the constraints α+ = β+ = λr+ = λ+s = ρr+ = ρ+s = 0, then an
expression for the complexity of the p2 stochastic blockmodel can be written as
L(θP2B, x) = − lg p(x|θP2B) + (2(b − 1) + 2)d(g(g − 1)) + Σ_{r,s} d(grs) + Σ_{r<s} d(2grs) + Σ_r d(grr).   (33)
3.3 Transitivity and Local Structure
While mechanisms that make explicit reference to category memberships can produce group structure in
social network exchange in a fairly straightforward way, there are other network processes that can produce
outcomes that appear quite similar. Specifically, there is a set of local processes and structures that,
when preferentially attended to in a given network, can produce clustering and other outcomes that, on
the surface, are difficult to distinguish from structures produced by the orientation of exchange behavior
to explicit categories. While these outcomes are similar, to the extent that a researcher is interested in
identifying the process by which group structures are formed, it is critical to model network exchange in a
way that is sensitive to the potentially independent presence of both of these processes and mechanisms.
A particular local structural mechanism that can lead to clustered social exchanges is a preferential bias
towards transitive closure in triads, particularly in networks that are also characterized by dyadic reciprocity.
In such systems, the establishment of a tie between a member of any existing cluster and an otherwise isolated
new actor will either result in that new actor being tied to all other members of the cluster, or the dissolution
of the original tie. While this local process can clearly produce clustered networks as easily as the explicit
structural mechanisms identified in the previous section, it is based on a process which operates on triads of
actors, and as such violates the assumption of dyadic independence that serves as a foundation for the p1
model and its aforementioned progeny.
While the p1 model is constrained by the assumption of dyadic independence, there are other statistical
models of graphs and networks that relax this assumption and account for a wider range of dependency
structures (Besag 1974; Frank and Strauss 1986; Pattison and Robins 2002). In the context of stochastic
models of social networks, these models are referred to as exponential random graph models (ERGM) and a
subset of these are referred to as p∗ models (Wasserman and Pattison 1996; Anderson et al. 1999; Snijders
et al. 2006). As these models are not based on the assumption of dyadic independence, they are able to
consider the effect of transitivity and other local structure on the production of group structure in network
exchange.
A general p∗ model (Wasserman and Pattison 1996; Anderson et al. 1999) predicts the observation of a
particular network as a function of an arbitrary number t of sufficient graph statistics ui(x), such that
p(x|θ) = exp{θ′u(x)} / κ(θ) = exp{θ1 u1(x) + · · · + θt ut(x)} / κ(θ),   (34)

where θ is a vector of linear model parameters and κ(θ) is a normalizing constant relative to the set of all
possible networks. Assessing the structure of a given network entails identifying an appropriate set of graph
statistics u(x) that collectively are able to model all micro-structures that could lead to the observed pattern
of exchange. For instance, the set of graph statistics u^(b)_rs(x), measured as

u^(b)_rs(x) = Σ_{φ(i)=r, φ(j)=s, i≠j} xij,   (35)
can be used to straightforwardly model the tendency of actors i in a group r to direct exchange at actors j
in a group s. Similarly, the graph statistic
u^(t)_0(x) = Σ_{i≠j≠k} xij xjk xik,   (36)
to the extent that it measures the observed number of transitive triples i, j, k in a graph, can be used as a
measure of transitivity.
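Both statistics are simple counts over a binary adjacency matrix, as the following illustrative sketch shows (x is an adjacency list-of-lists and phi maps each actor to a block; neither the code nor the example network comes from the original analysis).

```python
def block_tie_counts(x, phi):
    """Block-pair tie counts of Equation 35: u[(r, s)] is the number of
    ties sent from actors in block r to actors in block s."""
    u = {}
    n = len(phi)
    for i in range(n):
        for j in range(n):
            if i != j and x[i][j]:
                key = (phi[i], phi[j])
                u[key] = u.get(key, 0) + 1
    return u

def transitive_triples(x):
    """First-order transitivity statistic of Equation 36: the number of
    ordered triples (i, j, k) with ties i->j, j->k, and i->k."""
    n = len(x)
    return sum(x[i][j] * x[j][k] * x[i][k]
               for i in range(n) for j in range(n) for k in range(n)
               if i != j and j != k and i != k)

# A single transitive triple: 0 -> 1, 1 -> 2, closed by 0 -> 2.
x = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
assert transitive_triples(x) == 1
```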
While the expressive range of ERGMs is particularly useful for modeling group structure in social net-
works, the empirical assessment of these models can be difficult. Modeling the effect of graph constructs like
transitivity that are based on higher-order structures like triads requires that other lower-order structures
(two-stars, dyads) embedded in these higher-order structures be accounted for as well. Moreover,
the analytic and empirical evaluation of Equation 34 is difficult, due principally to the intractability
of determining κ(θ). A number of authors have proposed a pseudo-likelihood logit estimator (Strauss and
Ikeda 1990; Wasserman and Pattison 1996; Anderson et al. 1999) to address this problem. However, many
of the properties of even this estimator are not fully determined, and it is known to be unstable (Snijders
2002; Snijders et al. 2006), particularly with respect to graph statistics corresponding to these lower-order
structures.
In order to address some of these difficulties, an alternative set of graph statistics has been proposed for
ERGMs that is able to model transitivity and associated local structures without necessarily being subject
to these instability issues in estimation (Hunter and Handcock 2006; Snijders et al. 2006; Hunter 2007).
The estimation instability is in part attributed to the fact that models based on first-order graph statistics
place high probability on graphs with high indegree and outdegree. One general solution to this problem
is to use models that include parameters that place progressively less weight on the observation of graphs
with high degree. One way to do this is to use a set of weights that decrease geometrically by a factor α.
Graph statistics based on this idea are typically referred to as geometrically weighted measures. Following
this logic, Snijders et al. (2006) define a pair of measures u^(od)_α(x) and u^(id)_α(x) to assess the extent to which
the tendency of actors to have large out-degrees or in-degrees, respectively, characterizes the structure of a
network. These measures are defined as
u^(od)_α(x) = Σ_{i=1}^{n} e^{−α xi+},   (37a)
u^(id)_α(x) = Σ_{j=1}^{n} e^{−α x+j},   (37b)
where α > 0 is a parameter controlling the geometric rate of decrease in the weights.
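As an illustration (not part of the original analysis), these statistics can be computed directly from the degree sequences of a binary adjacency matrix with a zero diagonal:

```python
import math

def gw_degree_stats(x, alpha):
    """Geometrically weighted degree statistics of Equations 37a-37b:
    each actor contributes e^(-alpha * degree), so high-degree actors
    receive geometrically smaller weight (alpha > 0). Assumes x is a
    binary adjacency list-of-lists with a zero diagonal."""
    n = len(x)
    out_deg = [sum(row) for row in x]
    in_deg = [sum(x[i][j] for i in range(n)) for j in range(n)]
    u_od = sum(math.exp(-alpha * k) for k in out_deg)
    u_id = sum(math.exp(-alpha * k) for k in in_deg)
    return u_od, u_id

# In an empty network every actor has degree 0, so both statistics equal n.
u_od, u_id = gw_degree_stats([[0] * 5 for _ in range(5)], alpha=0.5)
```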
While these statistics address the problem of estimability, they introduce α as a variable that must be
chosen by a modeler, rather than as a parameter that can be estimated. Estimating the geometric weighting
factor as well requires that the class of ERGMs be expanded to include curved exponential families, only some
subset of which can be estimated (Hunter and Handcock 2006; Hunter 2007). Within this set of constraints,
Hunter (2007) proposes a modified version of the geometrically weighted degree statistics as
u^(od)_α(x; θs) = e^{θs} Σ_{i=1}^{n−1} {1 − (1 − e^{−θs})^i} ODi(x),   (38a)
u^(id)_α(x; θs) = e^{θs} Σ_{i=1}^{n−1} {1 − (1 − e^{−θs})^i} IDi(x),   (38b)
where ODi(x) is the number of actors in a network that have outdegree i, and IDi(x) is the number of
actors in a network that have indegree i.
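Equation 38a depends on the network only through the outdegree distribution, as a short illustrative sketch makes clear (od_counts[i−1] holds ODi(x); the numerical example is hypothetical):

```python
import math

def gw_outdegree_curved(od_counts, theta_s):
    """Hunter's curved-family geometrically weighted outdegree statistic
    (Equation 38a); the indegree version (38b) is identical with ID counts.
    od_counts[i-1] is the number of actors with outdegree i."""
    return math.exp(theta_s) * sum(
        (1 - (1 - math.exp(-theta_s)) ** i) * count
        for i, count in enumerate(od_counts, start=1)
    )

# If five actors each have outdegree 1, the i = 1 term reduces to
# e^theta * e^(-theta) * 5, i.e. 5 for any value of theta_s.
value = gw_outdegree_curved([5], theta_s=0.7)
```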
Applying the idea of geometrically weighted statistics to the modeling of the dependence of network
structure on degree counts is a necessary precursor to modeling transitivity—the construct of particular
interest in this article. The tendency for actors to disproportionately attract attention or direct attention
toward others is modeled in the former case by counts of k-instars and k-outstars, respectively. By a similar
logic, transitivity can be modeled by accounting for the presence and effect of k-transitive triangles and
k-independent directed twopaths. Snijders et al. (2006) propose a set of geometrically weighted graph statistics
to do this as
u^(t)_η(x) = η Σ_{i≠j} xij {1 − (1/η)^{L2ij}},   (39a)
u^(p)_η(x) = η Σ_{i≠j} {1 − (1/η)^{L2ij}},   (39b)

where η = e^α/(e^α − 1) and

L2ij = Σ_{k≠i,j} xik xkj   (40)

is the number of directed two-paths from i to j. The implied corresponding statistics in the curved exponential
family are then
u^(t)_η(x; θt) = e^{θt} Σ_{i=1}^{n−2} {1 − (1 − e^{−θt})^i} T3Ti(x),   (41a)
u^(p)_η(x; θp) = e^{θp} Σ_{i=1}^{n−2} {1 − (1 − e^{−θp})^i} I2Pi(x),   (41b)
where T3Ti(x) is equal to the number of transitive three-triangles in x and I2Pi(x) is equal to the number of
3-independent two-paths in x (Snijders et al. 2006).
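For illustration, the two-path counts of Equation 40 and the geometrically weighted statistics of Equations 39a–39b can be sketched as follows (the three-actor network is hypothetical):

```python
import math

def two_path_count(x, i, j):
    """L2_ij of Equation 40: directed two-paths from i to j."""
    n = len(x)
    return sum(x[i][k] * x[k][j] for k in range(n) if k != i and k != j)

def gw_transitivity_stats(x, alpha):
    """Geometrically weighted transitive-closure statistics of Equations
    39a-39b, with eta = e^alpha / (e^alpha - 1); requires alpha > 0."""
    eta = math.exp(alpha) / (math.exp(alpha) - 1)
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    u_t = eta * sum(x[i][j] * (1 - (1 / eta) ** two_path_count(x, i, j))
                    for i, j in pairs)
    u_p = eta * sum(1 - (1 / eta) ** two_path_count(x, i, j)
                    for i, j in pairs)
    return u_t, u_p

# One transitive triple (0 -> 1 -> 2 closed by 0 -> 2): a single two-path
# backs the 0 -> 2 tie, so with alpha = ln 2 (eta = 2) both statistics are 1.
x = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
u_t, u_p = gw_transitivity_stats(x, alpha=math.log(2))
```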
An ERGM that includes all four of these parameters can be used to distinguish the effect of transitivity
u^(t)_η from each of these other effects. Each of these four parameters is derived based on the value of all n(n−1)
ties in the network. A fully specified ERGM would also include a set of u^(b)_rs parameters, each of which would
be determined strictly by the grs tie values within a given block pair. Accordingly, the complexity of an
ERGM specified with all of these parameters can be determined as
L(θERGM, x) = − lg p(x|θERGM) + Σ_{r,s} d(grs) + 4d(n(n − 1)).   (42)
The complexity of a submodel of this ERGM that does not include micro-structural effects but only includes
block-pair statistics u^(b)_rs can be determined by dropping the last term of Equation 42.
Each of the mechanisms identified in this section can be included in a stochastic model to predict the
likelihood of observing a given set of network exchanges x. The parameterization of these mechanisms
can likewise be used to numerically determine the model specificity L(θ), such that competing models can
be compared. While most of the mechanisms identified here are independent of one another and could in
theory be modeled simultaneously, in the following section I restrict attention to a comparison of one explicit
categorical structure—dyadic group-level interaction—and transitivity as a local structural mechanism.
4 Empirical Applications
In this section I present two empirical analyses in order to demonstrate the utility of the algorithmic com-
plexity approach in the assessment of different kinds of structure in network data. The first data set is
derived from an analysis of friendship patterns among sixth-grade boys and girls (Hansell 1984). A second
data set is derived from affect relations between monks in a monastery (Sampson 1968). As such, these
data sources are appropriate applications for the methods presented here for two principal reasons. The
first is that each of these data sets have been extensively analyzed in earlier methodological studies of the
statistical structure of social networks (Holland and Leinhardt 1981; Wang and Wong 1987). These studies
provide a reasonable set of baseline expectations against which the results presented here can be evaluated.
Secondly, the empirical content of these networks provides a particularly relevant context in which to explore the
question of whether or not group-like structures are the result of externally-imposed categories or micro-level
exchange processes. The network derived from the first data set represents a context in which it would be
perfectly reasonable to expect that gender might serve as an external characteristic that would structure
friendship relations, even in the presence of tendencies toward transitive closure. The structuring mechanism
in the second network, however, is less clear. While external status markers may have played a role in
determining group structure among the monks, it is decidedly less clear what these markers may have been,
which makes this second case an empirically interesting context for the question put forward here.
4.1 Implementation
The examples described below are presented with the intention of replicating existing studies as closely as
possible, making only the modifications necessary to implement the algorithmic complexity approach to
blockmodel selection. To this end, the p1 blockmodeling approach (Wang and Wong 1987) was used in the
analysis of both data sets. ERGM estimates were performed using statnet (Handcock et al. 2005).
Estimation of the p1 stochastic blockmodel parameters entails solving for the estimated counts mij , aij
and nij , following the iterative scaling procedure outlined by Wang and Wong (1987). These estimated values
are used to produce estimates of αi, βj and λrs for comparison purposes. Then, the estimated parameters are
used to determine estimates of the dyad probabilities Dij , which are in turn used to estimate the probability
of the entire observed network in terms of the blockmodel p(x|θ). This network probability is then used to
determine a description length L(x, θ) for each blockmodel analyzed, and comparisons can be made between
these description lengths. Estimates for the dyad probabilities Dij where i < j are determined directly from
the parameter estimates mij , aij and nij using Equations 11. These dyad probability estimates Dij are
multiplied together to determine the probability of the observed network given the blockmodel p(x|θ):

p(x|θ) = Π_{i<j} Dij.   (43)
This estimate of p(x|θ) is used in Equation 30 to determine the description length L(θP1B , x) of a p1
blockmodel.
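In practice the product in Equation 43 underflows floating-point arithmetic for networks of any size, so an implementation would accumulate −lg p(x|θ) as a sum of logs before applying Equation 30. A sketch with hypothetical dyad probabilities (illustrative only; this is not the code used in the analysis):

```python
import math

def neg_lg_network_prob(dyad_probs):
    """-lg p(x|theta) via Equation 43, summing logs rather than multiplying
    probabilities, to avoid underflow over hundreds of dyads."""
    return -sum(math.log2(p) for p in dyad_probs)

def p1b_description_length(dyad_probs, n, k, g):
    # Equation 30, assuming the precision rule d(m) = lg(m) / 2 (Equation 5).
    d = lambda m: math.log2(m) / 2
    return (neg_lg_network_prob(dyad_probs)
            + (2 * (n - 1) + k + 2) * d(g * (g - 1)))

# Hypothetical dyad probabilities: four dyads, each with probability 1/2,
# contribute exactly 4 bits of data description length.
bits = neg_lg_network_prob([0.5, 0.5, 0.5, 0.5])
```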
While the above discussion illustrates a theoretical rationale for preferring geometrically
weighted transitive closure statistics to the first-order transitive triadic closure statistic u^(t)_0(x) in modeling
ERGMs, the current implementation of statnet does not support this newer measure. As such, while
the ERGM analyses presented here do employ geometrically weighted indegree and outdegree statistics, they
use the first-order transitivity statistic. The impact of this choice, however, is mitigated by the fact that
the geometrically weighted degree statistics contribute to the stability of the estimation process.
4.2 Assessing Structure in Hansell’s Classroom
The analysis of the classroom data (Hansell 1984) by Wang and Wong (1987) poses the question of whether
a model that considers the gendered relationships of the classroom members provides a better description
of the friendship ties than one that does not. The specific two-position blockmodel they test differentiates
between same-sex ties and opposite-sex ties, in addition to modeling the expansiveness and attractiveness
of each actor. Here, I apply the algorithmic complexity approach to answer that same question, specifically
to explore the possibility that a more accurate description is not necessarily a better description. In a
subsequent analysis, I also explore the question of whether or not group structure is a function of behavior
being shaped by explicit categories, or rather the result of micro-structural transitive closure processes.
The p1 stochastic blockmodel analysis begins with the estimation of the parameters mij , aij and nij
for the n = 27 actors in the network. Following Wang and Wong (1987), these parameters are estimated
using an iterative scaling process. These estimates are then used to compute estimates of αi, βj , and the
λrs. The estimates were determined by solving the system of linear equations defined by Equation 28 using
the method of Gaussian elimination. The iterative scaling procedure used to determine the estimates of
mij , aij and nij fit the block marginals of these parameters, so the αi and βj determined here express
expansiveness and attractiveness relative to the choice tendencies expressed by the block structure.
Model 1 is a baseline model only including these expansiveness and attractiveness parameters with no group
structure, while Model 2 explicitly estimates choice density within and across gender. The results of this
estimation process are presented in Table 1. The estimated overall choice density for same-sex blocks is
λ00 = λ11 = −0.523, and the estimated overall choice density for opposite-sex blocks is λ10 = λ01 = −2.308.
The next step in the process involves estimating the probabilities of the observed dyads in terms of the
model p(Dij |θ). A probability p(Dij |θ) is estimated for every dyad in the network, and these probabilities
are multiplied together in order to estimate the probability of the observed network given the blockmodel
p(x|θ), as indicated by Equation 43.
In order to estimate the model likelihood L(θ), the optimal precision for the model parameters must be
determined. The sample size for these parameters is the n(n − 1) = 702 tie strengths to be estimated.
Using Equation 5, I determine the optimal precision for these parameters to be
d(702) = lg 702 / 2 = 9.455 / 2 = 4.727.   (44)
Table 1: Maximum Likelihood Estimates of αi and βj from Hansell (1984) Data

            Model 1            Model 2
Student     αi       βj        αi       βj
1           -0.038   0.354     0.413    0.110
2           -1.23    0.261     -1.474   -0.553
3           0.33     -0.745    0.413    -1.499
4           1.627    -0.305    1.459    -0.836
5           1.478    -0.305    1.848    0.531
6           1.093    -0.549    1.049    0.051
7           -0.216   -0.908    -0.519   -1.809
8           0.598    -0.745    0.561    -1.499
9           -0.719   0.190     -0.757   -0.200
10          −∞       0.629     −∞       0.224
11          -2.492   -1.049    -2.295   -0.874
12          -0.883   -0.981    -0.633   -0.796
13          -1.23    0.118     -1.474   -0.285
14          0.308    0.958     0.000    0.911
15          -0.642   1.546     -0.315   1.421
16          2.995    0.865     3.439    0.949
17          -2.018   1.368     -1.609   1.349
18          -2.018   0.701     -1.694   1.349
19          1.33     -0.434    1.317    -0.024
20          1.607    −∞        1.224    −∞
21          -0.28    0.865     -0.510   0.838
22          0.547    -0.651    0.503    -0.188
23          0.432    -0.745    -0.163   -0.261
24          -1.034   0.701     -1.694   0.838
25          0.457    0.958     0.909    0.911
26          −∞       -1.049    −∞       -0.329
27          −∞       -1.049    −∞       -0.329
Table 2: Blockmodel Measures Applied to Hansell (1984) Data

          # Positions   G²θ     lg(p(x|θ))   L(θ)    L(x, θ)
Model 1   1             153.6   -372.7       255.3   627.9
Model 2   2             132.1   -328.3       260.0   588.3
Given this result, the determination of L(θ) is straightforward, following directly from Equation 30.
L(θP1B) = (2(n − 1) + k + 2)d(g(g − 1))
        = (2(27 − 1) + 1 + 2)d(702)
        = 55 × 4.7276
        = 260.02   (45)
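The arithmetic of Equations 44 and 45 is easily checked (an illustrative snippet, not part of the original analysis):

```python
import math

# Equation 44: optimal precision for parameters estimated from 702 ties.
d_702 = math.log2(702) / 2

# Equation 45: 55 parameters (2(27 - 1) + 1 + 2), each at precision d(702).
L_theta = (2 * (27 - 1) + 1 + 2) * d_702

assert abs(d_702 - 4.7276) < 1e-3   # roughly 4.73 bits per parameter
assert abs(L_theta - 260.02) < 0.1  # matches the value reported in Table 2
```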
The final step is to sum the description length of the model L(θ) with the description length of the
network in terms of the model L(x|θ). A summary of these calculations for both the one- and two-position
blockmodels is presented in Table 2, along with the G²θ statistic for each model.
Wang and Wong (1987) argue that the use of blocking information can improve the fit of p1 models.
For reasons argued above, there are many definitions of goodness-of-fit for which this claim is by definition
true. A model with more parameters and thus higher complexity should more accurately predict a set of
observed data than a less complex model. The results presented in Table 2 suggest that the cost of the model
complexity introduced by considering the gender relations of the schoolchildren in the 2-position p1 model
is more than made up for by the gains made in terms of model precision and accuracy. The G²θ statistic
proposed by Wasserman and Faust (1994) also suggests that the 2-position model more accurately represents
the data, as expected.
The algorithmic complexity of the p1 stochastic blockmodels of the classroom data suggests that the
friendship patterns of the sixth-graders are structured in a way that is consistent with their gender differences.
However, this analysis cannot on its own rule out the possibility that transitive closure plays an important
if not primary role in structuring these friendship relations. In order to explore this possibility, I analyze a
series of models that test the separate and combined effect of gender and transitivity in structuring friendship
relations in this population.
Model 3 is a baseline ERGM analogous to the 1-position undifferentiated p1 model analyzed
above. While this model does not include separate attractiveness and productivity parameters αi and βj, it
does include the geometrically weighted indegree and outdegree graph statistics u^(id)_α(x; θs) and u^(od)_α(x; θs).
Table 3: ERGM Parameter Estimates for Hansell (1984) Data

               Model 3    Model 4†   Model 5    Model 6†
Mutuality      -0.080     -0.028     -0.633     -0.220
               (0.740)    (0.014)    (0.794)    (0.996)
Indegree       -6.034     -1.184     -6.472     -1.383
               (3.779)    (0.014)    (13.110)   (1.443)
Outdegree      -16.090    -1.206     -13.045    -1.484
               (2.836)    (0.014)    (19.525)   (1.403)
Transitivity              0.216                 0.274
                          (0.838)               (0.124)
Gender Match                         1.104      0.785
                                     (0.819)    (0.367)
Choice         21.200     0.303      18.325     0.282
               (2.438)    (0.014)    (15.832)   (2.347)
− lg(p(x|θ))   659.40     559.49     617.72     542.04
L(θ)           17.41      21.64      22.14      26.37
L(x, θ)        676.81     581.13     639.86     568.41

†parameter estimation generated degenerate samples
While these statistics do not explicitly model individual variation in the likelihood of engaging in social
behavior, they do allow networks to be modeled in which there is individual variation across these tendencies.
Inasmuch as this does not add a pair of new model parameters for each additional actor in a system, it can
be a more efficient if not more realistic way to model this potential variance. The estimated total complexity
of Model 3 is somewhat higher than that of Model 1, suggesting that individual variance may in fact be
important in this small network.
Model 4 expands upon Model 3 by introducing a graph statistic for transitivity (u^(t)_0(x), the number
of transitive triads). While the degenerate samples generated during parameter estimation raise some
concern about the accuracy of the estimates, this model results in a substantial decrease in total complexity.
Model 5 produces a similarly large decrease in complexity by introducing a model statistic to account for the
tendency of friendship ties to be directed between students of the same gender. Model 6 includes both of these
parameters, resulting in a substantial improvement in total complexity over both Models 4 and 5, suggesting
that both transitivity and gender as a category are important structural features of this network. These
results are largely consistent with prior expectations about friendship relations in a sixth-grade classroom.
Transitive closure is a commonly observed property of most positive social relations, but it alone does not
explain the observed pattern of group structure. Rather, these results suggest that the gendered pattern of
friendship relations is not a random artifact, but may be reasonably attributed to the salience of gender as
Table 4: Candidate Partitions of Sampson’s Monks

           Group                Actors
Model 7    n/a                  All actors in one group
Model 8    Loyal Opposition     Ramauld, Bonaventure, Ambrose, Berthold, Peter, Louis, Victor
           Young Turks          Winfrid, John Bosco, Gregory, Hugh, Boniface, Mark, Albert
           Outcasts             Amand, Basil, Elias, Simplicus
Model 9    Hangers On           Ramauld, Bonaventure, Ambrose
           Loyal Opposition     Berthold, Peter, Louis, Victor
           Young Turk Leaders   Winfrid, John Bosco, Gregory
           Young Turks          Hugh, Boniface, Mark, Albert
           Outcasts             Amand, Basil, Elias, Simplicus
Model 10   n/a                  Each actor in his own group
a category in the choices of these students.
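The transitivity statistic ut(x) used in Model 4 can be computed directly from an adjacency matrix. The sketch below is a hypothetical Python illustration (the analyses reported here were presumably run with dedicated network software such as statnet); it counts ordered triples of distinct actors with ties i→j, j→k, and i→k:

```python
def transitive_triads(adj):
    """Count ordered triples (i, j, k) of distinct actors with ties
    i->j, j->k, and i->k: the transitivity statistic u_t(x)."""
    n = len(adj)
    count = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if len({i, j, k}) == 3 and adj[i][j] and adj[j][k] and adj[i][k]:
                    count += 1
    return count

# A small directed network: 0->1, 1->2, and the closing tie 0->2
# form exactly one transitive triad.
x = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
print(transitive_triads(x))  # -> 1
```

A three-cycle (0→1, 1→2, 2→0), by contrast, contains no transitive triad, since no shortcut tie closes any of its two-paths.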
4.3 Assessing Structure in Sampson’s Monastery
The analysis presented in the preceding section details how the algorithmic complexity approach can be
applied to an empirical problem of blockmodel selection. However, the two models proposed as candidate
blockmodels for the classroom data (Hansell 1984) by Wang and Wong (1987) cannot fully demonstrate
how the algorithmic complexity approach can effectively trade off between model accuracy and complexity.
In that example, the two-position model is both more accurate and a better model in terms of description
length. In this section, we present an analysis of a set of blockmodels for which the description length of the
model does not decrease monotonically with model accuracy.
In the analysis by White et al. (1976) of the monastery data (Sampson 1968), a number of alternative
blockmodels are proposed of varying degrees of complexity. In particular, White, Boorman and Breiger
present three- and five-position models of the social structure of the monastery corresponding to Models
8 and 9, respectively, as presented here. In this section, we use the algorithmic complexity framework to
analyze those two models in addition to Model 7, in which actors are undifferentiated by position,
and Model 10, in which each monk is assigned to his own position. Table 4 displays the monks
assigned to each position in each of these models.
As in the preceding section, we apply the p1 stochastic blockmodel (Wang and Wong 1987) as an un-
derlying probability distribution. Given this approach, we are left with the choice of how to assign the λrs
parameters to equivalence groups. Wang and Wong propose a number of methods by which this can be
done, each method representing a different specific hypothesis about the basic structure of within-position
and inter-position ties. In their original analysis, White, Boorman and Breiger do not assume a particular
structure of zero-blocks and one-blocks. In an effort to closely model their analysis, we present results for
the fully-saturated p1 blockmodel, which allows the maximum flexibility in determining λrs for each block.

Table 5: Maximum Likelihood Estimates of αi and βj from Sampson (1968) Data

                Model 7            Model 8            Model 9
Monk           αi      βj        αi      βj        αi      βj
Ramauld      1.039     −∞     -0.346     −∞      3.782     −∞
Bonaventure -0.603   1.391    -1.015   0.024    -0.329   0.970
Ambrose     -0.348   0.546    -1.314  -1.741    -0.329   0.970
Berthold     0.019  -0.914    -1.022  -3.079    -1.495  -6.373
Peter       -0.348   0.546    -1.314  -1.741    -1.203  -1.446
Louis        0.019  -0.552    -1.022  -3.350    -1.495  -3.500
Victor       0.318  -0.552    -1.227  -3.350    -1.900  -3.500
Winfrid     -0.603   1.391    -0.734   1.678    -0.112   2.213
John Bosco  -0.185   0.546    -0.802   0.009    -0.112   5.084
Gregory     -0.603   1.391    -0.734   1.678    -0.112   2.213
Hugh         0.318  -0.552    -0.802   0.009    -1.495  -0.348
Boniface     0.318  -0.914    -0.802  -0.660    -1.495  -4.459
Mark        -0.603   0.649    -0.802   1.108    -1.355  -0.348
Albert       0.318  -0.552    -0.802   0.009    -1.495  -0.348
Amand        0.318  -0.552     2.876   2.292     2.286   1.388
Basil        0.496  -0.335     4.043   2.370     2.286   2.665
Elias        0.318  -0.985     2.876   2.449     2.286   2.335
Simplicius  -0.185  -0.552     2.944   2.292     2.286   2.486

Table 6: Blockmodel Measures Applied to Sampson (1968) Data

            # Positions    G²θ     lg(p(x|θ))     L(θ)     L(x, θ)
Model 7          1         85.1      -170.9      148.6      319.5
Model 8          3         47.6      -116.0      165.1      281.2
Model 9          5         42.5      -107.1      214.7      321.8
Model 10        18          0.0         0.0     1341.8     1341.8
Table 5 presents the estimates of αi and βj for each model for which these estimates could be determined.
Estimates of αi and βj could not be determined for Model 10, as the system of equations needed to determine
these values is underdetermined as suggested in Section 4.1. However, estimates of m, a and n can be
determined for all of these matrices using the iterative scaling procedure, so the tie probabilities pij and the
likelihood p(x|θ) were estimated for all four blockmodels. The estimates of L(θ), lg(p(x|θ)) and L(x, θ) for each
model are presented, along with the corresponding G² statistic, in Table 6.
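Given fitted tie probabilities pij, the likelihood term of the description length follows mechanically. The Python sketch below is a simplified, hypothetical illustration that treats ties as independent Bernoulli draws; the full p1 model treats the two ties of each dyad jointly through the reciprocity parameter, so this is a sketch of the computation, not a reproduction of it:

```python
from math import log2

def log2_likelihood(x, p):
    """Base-2 log-likelihood lg p(x | theta) of an observed digraph x,
    given fitted tie probabilities p[i][j], treating ties as independent
    Bernoulli draws (a simplification of the dyad-based p1 likelihood)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-ties
            total += log2(p[i][j]) if x[i][j] else log2(1.0 - p[i][j])
    return total
```

With uniform pij = 0.5, any three-actor digraph has lg-likelihood −6: one bit for each of the six possible ties, consistent with the information-theoretic reading of lg p(x|θ) as a code length.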
These results more clearly demonstrate how a model selection criterion based on algorithmic complexity
can effectively be used to identify boundaries in a social system. As the number of blocks grows, the
precision and accuracy of the model grow, as measured both by p(x|θ) and G². At the same time, the model
complexity, as measured by L(θ), grows as well. The description length L(x, θ) presented here suggests that
the three-position Model 8 is the best-fit model of the four considered here, which matches the observational
intuition of Sampson (1968).
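The selection rule itself is mechanical: choose the model minimizing L(x, θ) = L(θ) − lg p(x|θ). A Python sketch using the values reported in Table 6 (totals agree with the table up to rounding):

```python
# (L(theta), -lg p(x | theta)) for each candidate blockmodel, from Table 6.
models = {
    "Model 7":  (148.6, 170.9),
    "Model 8":  (165.1, 116.0),
    "Model 9":  (214.7, 107.1),
    "Model 10": (1341.8, 0.0),
}

# Total description length L(x, theta) = L(theta) + (-lg p(x | theta)).
totals = {m: l_theta + neg_lg_lik for m, (l_theta, neg_lg_lik) in models.items()}
best = min(totals, key=totals.get)
print(best)  # -> Model 8
```

Model 10 fits perfectly (zero residual code length) but pays for it with an enormous model description, which is precisely the trade-off the MDL criterion formalizes.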
While these results are helpful in locating the boundaries between the actors in this social system, like
the p1 analysis of the classroom presented above, they cannot usefully speak to the extent to which the
apparent grouping has to do with categorical distinctions per se. While there is a reasonable a priori reason
to expect that sixth-graders might shape their social interactions along pre-existing categorical lines (e.g.
gender), there is somewhat less of a reason to expect the monks studied by Sampson (1968) to do the same.
Models 11-16 address this question by examining the separate and combined relevance of transitivity and
group structure to the pattern of affect relations in this social system. The results of each of these models are presented in Table 7.
Model 11, like Model 3 in the prior analysis, is a baseline ERGM of affect relations in the monastery.
Like the earlier ERGM models, this model does not estimate individual-level productivity and attractiveness,
but rather examines the general relevance of differences in the production and receipt of affect to structure
the entire network. While the estimated likelihood of Model 11 is somewhat lower than that of Model 7, it
is noteworthy that the total complexity is lower, suggesting that specific individual attributes are relatively less
important in this context. It is also noteworthy that the estimated coefficient for mutuality is consistently significant
in all six models examined here.
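The mutuality statistic to which these coefficients attach is simply the count of reciprocated dyads. As a minimal, hypothetical Python illustration:

```python
def mutual_dyads(adj):
    """Count unordered pairs {i, j} with both ties i->j and j->i present:
    the mutuality statistic of the ERGM."""
    n = len(adj)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if adj[i][j] and adj[j][i])

# 0<->1 is reciprocated; 0->2 is not: exactly one mutual dyad.
x = [[0, 1, 1],
     [1, 0, 0],
     [0, 0, 0]]
print(mutual_dyads(x))  # -> 1
```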
Model 12 introduces a statistic for transitivity, which results in a predicted likelihood p(x|θ) that is significantly
higher than that of Model 11, as well as a substantial reduction in total complexity. This is particularly
interesting in comparison to Model 13, which introduces model statistics sensitive to the pair of positions
that characterizes each dyad of actors. The total complexity for Model 13 is somewhat higher than that of Model
12, suggesting that transitivity is more important than group identities in explaining the overall pattern of
interaction. This result does not suggest that there are no meaningful boundaries around the three groups,
but rather that these boundaries are likely the result of the micro-process of transitive closure rather
than of some salient social characteristic.
The salience of the boundaries around the three groups is reflected in a comparison of the total complexity
of Models 11, 13, and 15: as in the p1 stochastic blockmodel analysis presented above, the three-group model
is preferred to both the one-group and five-group alternatives. Among the models that include transitivity
(Models 12, 14, and 16), the three-group Model 14 is likewise preferred to the five-group Model 16, although
once transitivity is accounted for, the block structure itself no longer reduces the total description length.
Table 7: ERGM Parameter Estimates for Sampson (1968) Data

                 Model 11  Model 12  Model 13  Model 14  Model 15  Model 16
Mutuality           2.590     2.245     1.363     1.340     1.192     1.082
                  (0.497)   (0.033)   (0.567)   (0.569)   (0.584)   (0.590)
Indegree           -1.877    -2.293    -2.727    -3.639    -2.682    -3.805
                  (0.921)   (0.033)   (1.043)   (1.265)   (1.047)   (1.253)
Outdegree           1.788    -0.007     2.081     2.330     1.544     2.210
                  (2.753)   (0.033)   (4.315)   (7.447)   (3.318)   (8.004)
Transitivity                  0.258              -0.206              -0.287
                            (1.591)             (0.145)             (0.162)
3-Block Effects                           Y         Y
5-Block Effects                                               Y         Y
Choice             -2.092     2.727    -3.179    -2.427    -2.951    -2.383
                  (2.727)   (0.033)   (4.503)   (7.968)   (3.571)   (8.214)
−lg(p(x|θ))        222.42    174.34    180.67    177.26    173.07    168.10
L(θ)                15.01     19.14     31.31     35.42     46.32     50.45
L(x, θ)            237.43    193.48    211.98    212.70    219.39    218.55
5 Discussion
Given the interest on the part of a wide range of social scientists in social boundaries and the processes that
attend to them, it is rather surprising that so little attention has been paid to their empirical identification.
One of the goals of sociological methods should be to provide tools that can be used with empirical data to either
support or call into question sociological theory and constructs. If boundaries figure centrally into theories
of structure, categories and type, then it is particularly important to develop methods that can empirically
establish their presence and significance. To the extent that a method cannot in principle provide evidence
that boundaries fail to play a significant role in a particular system, that method cannot in general be used in
other cases to support the claim that they do. The blockmodeling technique proposed by White et al. (1976)
and subsequent methods provide tools that can effectively be used to elicit possible social structures from
network data, and as such can be quite useful for supporting theories concerning roles and positions in social
groups. We have few tools, however, that can establish, for example, that boundaries are unimportant in a
particular social network, or that the emergence of group structure is attributable to some other micro-level
process.
To this end, in this article I have argued that analyses of social networks based on the idea of algorithmic
complexity are uniquely suited to identifying social structure. Not only can the method
outlined here generate falsifiable hypotheses about the presence or absence of structure, but it clearly relates
the selection of a particular model to specific claims about both the fit of the model to the observed network
and to the fundamental likelihood of observing the model itself. The underlying probabilistic nature of these
claims can moreover be used to show how a model selection framework based on the idea of algorithmic
complexity compares favorably to model selection approaches such as the AIC (Akaike 1974) and the BIC
(Raftery 1995). An additional feature of the approach outlined here is that it can be used with any stochastic
model of social network exchange. An implication of this is that, to the extent that models like the ERGM
(Snijders et al. 2006) can simultaneously model both explicit group structure and group structure produced
by local exchange processes, the significance of both of these potential antecedents of group structure can
independently be evaluated.
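As a point of comparison, the competing criteria can each be written as a penalized lack of fit, differing only in the penalty. The following Python sketch states the standard formulas (k parameters, n observations, natural-log likelihood for AIC and BIC, base-2 code lengths for MDL); it is an illustration of the criteria themselves, not a reproduction of the computations reported in this article:

```python
from math import log

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L (Akaike 1974)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * log(n) - 2 * log_lik

def mdl(model_length_bits, neg_lg_lik_bits):
    """Two-part description length in bits: L(theta) + (-lg p(x | theta))."""
    return model_length_bits + neg_lg_lik_bits
```

All three are minimized over candidate models; the MDL variant makes both the model cost and the data cost explicit code lengths rather than asymptotic penalties.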
While the approach outlined here is reasonably flexible in its ability to incorporate existing statistical
models of social network exchange, it is nevertheless limited in its ability to assess those sociologically
interesting forms of group structure that are yet to be represented in a stochastic network model. Particularly
noteworthy among these is the idea of regular equivalence (White and Reitz 1983), a generalization of
structural equivalence in which actors are judged to be equivalent if they have a similar pattern of relations
to other equivalent actors, rather than similar relations to exactly the same actors. This type of equivalence
structure is more characteristic of many common identity-based relations (parent-child, employer-employee,
core state-peripheral state) than is structural equivalence. However, while structural equivalence can be
straightforwardly captured in a statistical model of network exchange as demonstrated in the models reviewed
here, measures of regular equivalence have as yet failed to appear in these models. If group structure in
a particular social system is in fact driven by regular equivalence rather than structural equivalence, a
straightforward application of the methods presented here could lead to misleading results.
Limitations of the ability to model particular features notwithstanding, the approach presented here
represents a helpful step forward in the statistical analysis of structure in social networks. The empirical
examples presented here demonstrate not only how the method can be used to identify boundaries in a social
system, but also, as in the Sampson’s monastery example, how group structure can be identified that may
be the result of the local structure of exchange rather than that of explicit boundaries. While the results
presented here are consistent with those presented in earlier research, they are distinguished by the relative
strength of the statistical basis on which they rest. There has been a relative paucity of
recent research in which the empirical identification of boundaries has played a central role. If this can be
explained by the lack of a solid theoretical and methodological foundation for such research, then
the algorithmic complexity approach presented here should serve as a useful contribution.
References
Abbott, Andrew and Alexandra Hrycak, 1990. “Measuring Resemblance in Sequence Data: An Optimal
Matching Analysis of Musicians’ Careers.” American Journal of Sociology 96:144–185. ISSN 0002-9602.
Akaike, Hirotugu, 1974. “A New Look at the Statistical Model Identification.” IEEE Transactions on
Automatic Control 19:716–723. ISSN 0018-9286.
Alderson, Arthur S. and Jason Beckfield, 2004. “Power and Position in the World City System.” American
Journal of Sociology 109:811–851.
Anderson, Carolyn J., Stanley Wasserman, and Bradley Crouch, 1999. “A p∗ Primer: Logit Models for
Social Networks.” Social Networks 21:37–66.
Anderson, Carolyn J., Stanley Wasserman, and Katherine Faust, 1992. “Building Stochastic Blockmodels.”
Social Networks 14:137–161.
Besag, Julian, 1974. “Spatial Interaction and the Statistical Analysis of Lattice Systems.” Journal of the
Royal Statistical Society. Series B (Methodological) 36:192–236. ISSN 0035-9246.
Burnham, Kenneth P. and David R. Anderson, 2004. “Multimodel Inference: Understanding AIC and BIC
in Model Selection.” Sociological Methods and Research 33:261–304. doi:10.1177/0049124104268644.
Carrington, Peter J. and Greg H. Heil, 1981. “COBLOC: A Hierarchical Method for Blocking Network
Data.” Journal of Mathematical Sociology 8:103–131.
Carrington, Peter J., Greg H. Heil, and Stephen D. Berkowitz, 1979. “A Goodness-of-fit Index for Block-
models.” Social Networks 2:219–234.
Chaitin, Gregory J., 1966. “On the Length of Programs for Computing Finite Binary Se-
quences.” Journal of the Association for Computing Machinery 13:547–569. ISSN 0004-5411. doi:
http://doi.acm.org/10.1145/321356.321363.
Fienberg, Stephen E. and Stanley Wasserman, 1981. “Categorical Data Analysis of Single Sociometric
Relations.” In “Sociological Methodology 1981,” edited by Samuel Leinhardt, pp. 156–192. San Francisco:
Jossey-Bass.
Frank, Ove and David Strauss, 1986. “Markov Graphs.” Journal of the American Statistical Association
81:832–842. ISSN 0162-1459.
Gerlach, Michael L., 1992. “The Japanese Corporate Network: A Blockmodel Analysis.” Administrative
Science Quarterly 37:105–139.
Han, Shin-Kap and Phyllis Moen, 1999. “Clocking Out: Temporal Patterning of Retirement.” American
Journal of Sociology 105:191–236. ISSN 0002-9602.
Handcock, Mark S., David R. Hunter, Carter T. Butts, and Martina Morris, 2005. “statnet: An R Package
for the Statistical Analysis and Simulation of Social Networks.”
Hansell, Stephen, 1984. “Cooperative Groups, Weak Ties, and the Integration of Peer Friendships.” Social
Psychology Quarterly 47:316–328.
Holland, Paul W., Kathryn B. Laskey, and Samuel Leinhardt, 1983. “Stochastic Blockmodels: Some First
Steps.” Social Networks 5:109–137.
Holland, Paul W. and Samuel Leinhardt, 1981. “An Exponential Family of Probability Distributions for
Directed Graphs.” Journal of the American Statistical Association 76:33–65.
Hunter, David R., 2007. “Curved Exponential Family Models for Social Networks.” Social Networks (forthcoming).
Hunter, David R. and Mark S. Handcock, 2006. “Inference in Curved Exponential Family Models for Networks.”
Journal of Computational and Graphical Statistics 15:565–583.
Kolmogorov, Andrey N., 1965. “Three approaches to the quantitative definition of complexity.” Problems
in Information Transmission 1:4–7.
Krackhardt, David, 1987. “Cognitive Social Structure.” Social Networks 9:109–134.
Kuha, Jouni, 2004. “AIC and BIC: Comparisons of Assumptions and Performance.” Sociological Methods
and Research 33:188–229. doi:10.1177/0049124103262065.
Lamont, Michele and Virag Molnar, 2002. “The Study of Boundaries in the Social Sciences.” Annual Review
of Sociology 28:167–195.
Laumann, Edward O., Joseph Galaskiewicz, and Peter V. Marsden, 1978. “Community Structure as Interor-
ganizational Linkages.” Annual Review of Sociology 4:455–484. doi:10.1146/annurev.so.04.080178.002323.
Lorrain, Francois P. and Harrison C. White, 1971. “Structural Equivalence of Individuals in Social Networks.”
Journal of Mathematical Sociology 1:48–80.
Nowicki, Krzysztof and Tom A. B. Snijders, 2001. “Estimation and Prediction for Stochastic Blockstruc-
tures.” Journal of the American Statistical Association 96:1077–1087.
Panning, William H., 1982. “Fitting Blockmodels to Data.” Social Networks 4:81–101.
Pattison, Philippa and Garry Robins, 2002. “Neighborhood-Based Models for Social Networks.” Sociological
Methodology 32:301–337. ISSN 0081-1750.
Phillips, Damon J. and Ezra W. Zuckerman, 2001. “Middle-Status Conformity: Theoretical Restatement
and Empirical Demonstration in Two Markets.” American Journal of Sociology 107:379–429.
Raftery, Adrian E., 1995. “Bayesian Model Selection in Social Research.” In “Sociological Methodology
1995,” edited by Peter V. Marsden, pp. 111–196. San Francisco: Jossey-Bass.
Rissanen, Jorma, 1983. “A Universal Prior for Integers and Estimation by Minimum Description Length.”
The Annals of Statistics 11:416–431. ISSN 0090-5364.
———, 1989. Stochastic Complexity in Statistical Inquiry. Teaneck, N.J.: World Scientific.
Sampson, Samuel F., 1968. A Novitiate in a Period of Change: An Experimental and Case Study of Relationships.
Ph.D. thesis, Cornell University.
Shannon, Claude E., 1948. “A Mathematical Theory of Communication.” Bell System Technical Journal
27:379–423.
Snijders, Tom A. B., 2002. “Markov Chain Monte Carlo Estimation of Exponential Random Graph Models.”
Journal of Social Structure 3.
Snijders, Tom A. B. and Krzysztof Nowicki, 1997. “Estimation and Prediction for Stochastic Blockmodels
for Graphs with Latent Block Structure.” Journal of Classification 14:75–100. ISSN 0176-4268. doi:
10.1007/s003579900004.
Snijders, Tom A. B., Philippa E. Pattison, Garry L. Robins, and Mark S. Handcock, 2006. “New
Specifications for Exponential Random Graph Models.” Sociological Methodology 36:forthcoming. doi:
10.1111/j.1467-9531.2006.00171.x.
Snyder, David and Edward L. Kick, 1979. “Structural Position in the World System and Economic Growth,
1955-1970: A Multiple-Network Analysis of Transnational Interactions.” American Journal of Sociology
84:1096–1126. ISSN 0002-9602.
Solomonoff, Raymond J., 1964. “A Formal Theory of Inductive Inference, Part 1 and Part 2.” Information
and Control 7:224–254.
Stine, Robert A., 2004. “Model Selection Using Information Theory and the MDL Principle.” Sociological
Methods and Research 33:230–260. doi:10.1177/0049124103262064.
Stovel, Katherine, Michael Savage, and Peter Bearman, 1996. “Ascription into Achievement: Models of
Career Systems at Lloyds Bank, 1890-1970.” American Journal of Sociology 102:358–399. ISSN 0002-
9602.
Strauss, David and Michael Ikeda, 1990. “Pseudolikelihood Estimation for Social Networks.” Journal of the
American Statistical Association 85:204–212. ISSN 0162-1459.
van Duijn, Marijtje. A. J., Tom A. B. Snijders, and Bonne J. H. Zijlstra, 2004. “p2: a Random Effects Model
with Covariates for Directed Graphs.” Statistica Neerlandica 58:234–254.
Wallace, Christopher S. and David M. Boulton, 1968. “An Information Measure for Classification.” Computer
Journal 11:185–194.
Wallace, Christopher S. and David L. Dowe, 1999. “Minimum Message Length and Kolmogorov Complexity.”
The Computer Journal 42:270–283.
Wang, Yuchung J. and George Y. Wong, 1987. “Stochastic Blockmodels for Directed Graphs.” Journal of
the American Statistical Association 82:8–19.
Wasserman, Stanley and Katherine Faust, 1994. Social Network Analysis: Methods and Applications. Cam-
bridge: Cambridge University Press.
Wasserman, Stanley and Philippa Pattison, 1996. “Logit Models and Logistic Regression for Social Networks:
I. An Introduction to Markov Graphs and p∗.” Psychometrika 61:401–425.
White, Douglas R. and Karl P. Reitz, 1983. “Graph and Semi-group Homomorphisms on Networks of
Relations.” Social Networks 6:193–235.
White, Harrison C., Scott C. Boorman, and Ronald L. Breiger, 1976. “Social Structure from Multiple
Networks. I. Blockmodels of Roles and Positions.” American Journal of Sociology 81:730–779.
Zijlstra, Bonne J. H. and Marijtje A. J. van Duijn, 2003. Manual p2. Version 2.0.0.7. iec ProGAMMA/University
of Groningen, Groningen.