Algorithmic Complexity and Structural Models of Social Networks∗
Christopher Wheat
MIT Sloan School of Management
50 Memorial Drive, Cambridge, MA 02142-1347
April 16, 2007
Abstract
This article explores how the algorithmic complexity approach can be used to address the problem of identifying group structures in social networks. A specific implementation of the algorithmic complexity approach based on the principle of minimum description length (MDL) is compared to other model selection criteria, and compared and contrasted with a Bayesian approach to model selection. The method presented here provides a statistical basis for determining how many groups the actors in a given network should be partitioned into. Additionally, this article explores the analysis of two independent mechanisms by which group structures might be produced in social networks—those associated with explicit categories and those associated with preferential attention to particular local structures. I outline a method for using p1 stochastic blockmodels and exponential random graph models (ERGMs) in the context of the identified algorithmic complexity approach to address this question, and demonstrate the method in two empirical settings.
1 Introduction
Much of contemporary sociology is rooted in two simple premises about the nature of social structure. The
first is that some structures can be most usefully understood in terms of the social categories that actors are
assigned to. A related second premise is that actors meaningfully attend to the boundaries between these
categories (Lamont and Molnar 2002). For instance, the argument that career paths are best understood
not in terms of a series of isolated job-to-job transitions, but rather as elaborations of a set of ideal-typical career paths (Abbott and Hrycak 1990; Stovel et al. 1996; Han and Moen 1999) is one example of the
exploration of these premises. This work develops the basic idea that actors and the institutions that embed
them treat these career paths as meaningfully typed social objects, and that once an actor is in a given
career path, it is relatively unlikely that she will cross the boundary into another.

∗I would like to thank Peter Marsden, Tiziana Casciaro, David Gibson, Joel Podolny, Nitin Nohria and Kate Kellogg for their invaluable feedback on this and earlier versions of this work. I would also like to thank David Hunter for assistance with statnet and curved exponential model estimation. Some of the analysis was performed with statnet 1.0, developed with support from NIH grants R01DA012831 and R01HD041877.

In a different context,
Phillips and Zuckerman (2001) argue that when actors can be divided into high- and low-status categories,
high-status actors near the boundary between the two will behave in a distinctive status-preserving way. In
each of these cases, a core argument is presented based on the idea that the social behavior of an actor is
not determined strictly by her individual characteristics and motivations, but also by characteristics of and
structures around meaningful social categories.
The identification of boundaries within social systems is, as such, an issue of interest in a wide range
of sociological phenomena. That said, it presents a particularly significant opportunity for sociologists who
employ social network methods in their analyses (White et al. 1976; Laumann et al. 1978). Social network
constructs such as structural equivalence (Lorrain and White 1971) and regular equivalence (White and
Reitz 1983) provide a facility by which similarities and differences between actors can be quantitatively
measured. When used in combination with various clustering techniques, these constructs have been used to
analyze group structures within a wide range of social systems (White et al. 1976; Gerlach 1992; Alderson
and Beckfield 2004). These blockmodel analyses provide the basis of a formal method that can be used to
assess group structure in networks.
Blockmodel analyses such as these have widely been used to empirically assess group structure in net-
works. However, there is a significant shortcoming inherent in the method as it has typically been applied.
The logic of structural or regular equivalence is directly applicable to the determination of the similarity
or difference between a pair of actors, and as such represents a useful first step in the process of identify-
ing meaningful groupings. This information cannot, however, be used in isolation to identify meaningfully
bounded groups. A complete answer to this question requires a method that can be used to identify cases
in which two actors that are not precisely identical should be included within the same group, or likewise,
when two actors that are sufficiently distinct should be partitioned into different groups. Such a method is
qualitatively distinct from a method that can only measure the degree to which a particular pair of actors is
similar or different. Moreover, even to the extent that they provide some information about group structures
in social networks, standard blockmodel analyses provide little information about the processes by which
these structures are produced.
These issues can be illustrated by reviewing the analysis by White et al. (1976) of the pattern of relations
of positive affect in a monastery (Sampson 1968) as depicted in Figure 1. White and his coauthors suggest
that the monks in this study might meaningfully be divided into a number of groups, based on the pattern of
their expressed positive affect toward one another. To the extent that group memberships and the boundaries
between them play a role in shaping affect, the patterning of these relationships should reflect the underlying
Figure 1: Sampson's Monks—Affect Relations. [Network diagram; the monks are partitioned into three labeled groups: Young Turks, Loyal Opposition, and Outcasts.]
                   Young Turks   Loyal Opposition   Outcasts
Young Turks             1                0              0
Loyal Opposition        0                1              0
Outcasts                0                1              1

Figure 2: Image Matrix for Sampson's Monks
structure of this social system. In particular, White et al. were interested in identifying “zeroblocks”—the
relative absence of a particular type of relation between two groups in a given system. Accordingly, they
identify a three-group structural model that mirrors the original analysis by Sampson (1968). Figure 2
illustrates the derived schematic set of relationships and their absence. The results of this blockmodel
analysis suggest, for example, that the Outcasts express positive affect toward the Young Turks, but that
the Young Turks do not reciprocate this affection.
This analysis is useful in that it effectively captures the pattern of interaction between the actors in
this social system. It is generally true, for instance, that members of the Young Turk group are unlikely to
express positive affect toward members of the Outcast group, suggesting that the boundary between these
two groups is an important one. The three-group model illustrated here is not a perfect model, however,
in that there are certainly exceptions to the rules it implies. In some sense, a more accurate model might
be obtained by decomposing the three proposed groups into five groups—an alternative model White et al.
(1976: 751-2) also propose. In fact, given that there are no two actors in this system that are completely
equivalent, the logical extension of this argument would be that the most accurate model would be one in
which there are no groups at all and each actor is assigned to his own position.
It is also noteworthy that the only explanatory variables involved in this type of blockmodel analysis are
the group membership of the two actors involved in a given type of social exchange. Both the three-group
and five-group models proposed by White et al. implicitly make the claim that the observed pattern of
exchanges is explained simply by the group memberships of each actor. While this may be the case, there
are other mechanisms besides explicit group membership and attentiveness to social boundaries, such as
tendencies toward reciprocity and transitive closure, that might produce clustered social networks. Inasmuch
as blockmodel analyses of this sort cannot account for these mechanisms and processes, they are further
constrained in their ability to make claims about the presence or absence of explicit social boundaries.
The fundamental problem illustrated by this example is that the logic of equivalence, while being quite
useful for determining the extent to which two actors are similar or different, is by itself insufficient for
determining which differences or similarities are substantively meaningful. There are a set of goodness-of-fit
measures (Carrington et al. 1979; Carrington and Heil 1981; Wasserman and Faust 1994) that can be used,
along with these equivalence measures, to narrow down the total number of group-based models of social
structure that might be applied to a network of interactions. Selecting between models that have varying
numbers of explanatory groups, particularly to the extent that other social processes and mechanisms are
accounted for, requires a different approach.
One approach to identifying the presence or absence of explicit groups and boundaries in a social network
is to cast the problem in terms of model selection. To the extent that group memberships are used to predict
whether or not a given pair of actors will socially interact, models with varying numbers of subgroups
can be viewed as predictive models with varying numbers of explanatory variables. Similarly, models that
consider other cluster-producing mechanisms such as reciprocity and transitivity will include additional
predictive variables. In general, the problem of selecting between models differentiated by their use of
explanatory variables has recently attracted the interest of researchers who study social systems (Burnham
and Anderson 2004; Kuha 2004; Stine 2004). In particular, the Akaike information criterion (AIC) (Akaike
1974) and the Bayesian information criterion (BIC) (Raftery 1995) have been proposed as metrics that can be
used to establish the appropriate number of explanatory variables in the context of linear regression models.
In a similar vein, Stine (2004) proposes that the Minimum Description Length (MDL) criterion (Rissanen
1983, 1989) is a particularly useful metric for solving this problem, because it offers a more explicit mechanism
for taking account of the total set of models considered.
In this article, I argue that the MDL approach in particular and the algorithmic complexity approach in
general are not only useful for model selection in the regression context, as argued by Stine (2004), but are also
particularly useful for model selection in the context of structural models of social networks. Specifically, I
argue that the boundary identification problem can be solved in this context by relying upon the explicit and
theoretically grounded trade-off that the algorithmic complexity approach makes between model complexity
and model accuracy. In the following section I present a general discussion of the algorithmic complexity
framework and how it addresses issues presented by the problem of model selection. Section 3 presents a
specific application of the algorithmic complexity framework to model selection in the context of stochastic
statistical models of network structure. In Section 4 I demonstrate the use of the general approach in
the analysis of group structure in two empirical examples.
2 Algorithmic Complexity and Model Selection
The example of Sampson’s monastery demonstrates that the task of identifying the boundaries in a so-
cial system is equivalent to determining the salient group memberships in a population that govern social
interaction. Any population of actors can potentially be divided into subgroups in a wide range of ways—
the three-group and five-group models proposed by White et al. (1976) are only two of many possibilities.
Each of these partitions of actors into groups can be conceptualized as a model of behavior that takes a
corresponding set of boundaries into account.
The consideration of the extent to which a given model of social exchange is an appropriate representation
of a set of observed behavior can be decomposed into three conceptually distinct questions. The first question
concerns the accuracy of the model—the correspondence between the behavior that it describes and the
behavior that is actually observed. The five-group model of affect in the Sampson monastery is, in this
sense, more accurate than the three-group model. A second question concerns the specificity of the model—
the extent to which a particular model is likely to be constrained to a particular set of observed possibilities.
Models which are less specific, or more generalizable, are typically thought of as being more useful models in
some general sense. So while the five-group model of affect is more accurate than is the three-group model,
it is also more specific. Inasmuch as these first two questions reflect competing concerns, a third question
concerns the issue of how to balance the model accuracy with model specificity.
In the context of the analysis of blockmodels of social networks in particular, there is a substantial body
of research that has focused primarily upon the first of these questions. Much of the early research in
this area (Carrington et al. 1979; Carrington and Heil 1981; Panning 1982; see also Wasserman and Faust
1994) focused on deterministic blockmodels—models in which empirically observed exchanges are essentially
conceptualized as equivalent to social structure rather than a realization thereof. Subsequent developments of
stochastic rather than deterministic conceptualizations of network structure models (Holland and Leinhardt
1981; Wasserman and Pattison 1996; Anderson et al. 1999; van Duijn et al. 2004) led to a general consensus
that the accuracy of a particular network model should be measured in terms of its ability to predict an
observed pattern of ties with high probability. A number of studies have applied this general logic specifically
to the problem of studying group structure using network data (Fienberg and Wasserman 1981; Holland et al.
1983; Wang and Wong 1987; Anderson et al. 1992; Snijders and Nowicki 1997; Nowicki and Snijders 2001).
While there has been considerable development in refining approaches to solving the problem associated
with the first question, there is somewhat less agreement about the second and the third. In the context
of stochastic blockmodeling, a G2 likelihood-ratio statistic has been proposed as a useful way to compare
multiple network models, specifically in the case where one candidate model is nested in another (Anderson
et al. 1992). Other approaches have been suggested as well (Snijders and Nowicki 1997; Nowicki and Snijders
2001), but few approaches have been proposed that offer a theoretically grounded rationale for combining a
measure of model accuracy with an assessment of the properties of the model itself.
In this section, I argue that model selection approaches based on the logic of algorithmic complexity
(Solomonoff 1964; Kolmogorov 1965; Chaitin 1966) dominate all other approaches. An approach based on
algorithmic complexity has the advantage of addressing all three of these questions in a consistent logic
grounded in probability theory and leads to a selection logic that can be clearly explained and explicitly
compared to other model selection approaches. Accordingly, after outlining the basic features of the approach,
I demonstrate how other approaches that have been recently proposed can be analyzed using the same general
framework, and discuss some of the implications of using these other selection criteria.
2.1 Algorithmic Complexity
A seminal stream of ideas in the field of information theory that has some bearing on the problem of identi-
fying structure in social exchange behavior concerns the proposition that any set of data has a measurable
amount of randomness, or conversely, non-random structure. The unique contribution of these ideas to the
problem of balancing the accuracy of a model with its specificity is the way in which this theory provides
a framework in which both of these concepts are measured in the same terms. There are several model
selection measures that are based in this tradition, and each differs from the other in subtle but important
ways. While these differences can in certain cases be meaningful, they are less significant than the way
in which these measures differ from other measures that have been commonly applied to model selection
problems in the social sciences.
The central idea in this stream of research is that abstract data can be characterized in terms of the
amount of information that it communicates—which is to say, the extent to which it is a representation of a structured phenomenon rather than random noise (Shannon 1948). To the extent that a set of observations
is in fact structured rather than completely random, the structure embedded in the data should be useful in
finding an efficient non-redundant representation of that data (Chaitin 1966; Kolmogorov 1965). Algorithmic
complexity is a measure of the size of this most-efficient representation, typically referring to the length of
the computer code that expresses the most efficient algorithm that would reproduce the original set of data.
Two extensions of this idea that are particularly relevant to the task of model selection in the social
science context are the subsequent independent developments of the Minimum Message Length (Wallace and
Boulton 1968) and the Minimum Description Length (Rissanen 1983, 1989) measures. While the Minimum
Message Length (MML) and Minimum Description Length (MDL) measures vary both in their theoretical
underpinnings and in some of their practical implications (Wallace and Dowe 1999), they have in common a
reliance on the decomposition of the algorithmic complexity measure into two parts, one of which refers to
the accuracy of a model, and another which refers to model specificity. Rather than requiring researchers to
focus on finding the shortest arbitrary algorithm that could reproduce a particular set of observations, this
decomposition allows the problem of model selection to be posed in a way that is compatible with existing
social science approaches to modeling structure.
Both the MDL and the MML approach measure algorithmic complexity in terms of the number of binary
digits (bits) it would theoretically take to specify both the features of a given structural model θ and a given
observation x conditional on the indicated model representing a true hypothesis, such that
L(θ, x) = L(θ) + L(x|θ), (1)
where L(θ, x) is the total complexity, L(θ) is a measure of the complexity of the model, and L(x|θ) is the
complexity of the data conditioned on the model. A key result due to Shannon is that L(x|θ) is equal to
−lg(p(x|θ)), such that the total complexity
L(θ, x) = L(θ)− lg(p(x|θ)). (2)
The similarity of this expression to Bayes’ theorem leads to a particularly useful result for model selection
in the context of stochastic models of social structure. Bayes' theorem states that
p(θ|x)p(x) = p(θ)p(x|θ). (3)
For a given observation of x, p(x) is either undefined or fixed. If the model selection task in general is
characterized as finding the most likely model of structure given observed data x, then maximizing the
right hand side of Equation 3 is equivalent to maximizing p(θ|x) and explicitly solving the model selection
problem. Similarly, if we define L(θ) as −lg(p(θ)), then Equation 2 is just a binary logarithmic expression
of Equation 3.
It is, of course, non-trivial to assume that L(θ) can be defined this way, or even that prior probabilities p(θ)
can be assigned to abstract structural models—the theories underlying MML and MDL diverge precisely on
this point. Nevertheless, the logic of this decomposition poses a compelling question to the broader problem
of model selection. The intuitive logic of Occam’s Razor is captured in an expression with explicit grounding
in probability theory. Consider a dyad in which the presence or absence of exchange can be explained by
two candidate models of the organization in which the dyad is embedded. The model θ1 is characterized by
a single dichotomous parameter x1 ∈ {non-profit, for-profit}. A second model θ2 is characterized by both
x1 and a second dichotomous parameter x2 ∈ {large, small}. The two possible models consistent with the
one-parameter model θ1 can each be described with a single bit, and in the absence of any other information
about the organization in which the dyad is embedded, each of these models might be assigned a prior
probability of 1/2. Similarly, the four possible models consistent with the two-parameter model θ2 can each
be described with two bits and might each be assigned an equal probability of 1/4.
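The bookkeeping in this example can be sketched in a few lines of code. The fit probabilities used below are hypothetical illustrations of my own, not estimates from the source; the function names are likewise assumptions.

```python
import math

def model_bits(k):
    """Description length of a model with k dichotomous parameters:
    one of 2**k equally likely candidates costs lg(2**k) = k bits."""
    return k

def model_prior(k):
    """Implied prior probability of any one such model."""
    return 2.0 ** (-k)

def total_length(k, p_fit):
    """Two-part description length L(theta, x) = L(theta) - lg p(x|theta)."""
    return model_bits(k) - math.log2(p_fit)

# theta_1 ({non-profit, for-profit}): 1 bit, prior 1/2.
# theta_2 (adds {large, small}):      2 bits, prior 1/4.
assert model_prior(1) == 0.5 and model_prior(2) == 0.25

# A richer model must buy enough extra fit to pay for its extra model
# bit; with these hypothetical fit probabilities, theta_1 wins.
assert total_length(1, 0.25) < total_length(2, 0.20)
```

Here a one-parameter model that assigns the data probability 1/4 costs 1 − lg(1/4) = 3 bits in total, while the two-parameter alternative would need to fit strictly better than probability 1/2 of that to be preferred.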
In general, the idea that models characterized by a larger set of dichotomous parameters should be less likely than those characterized by a smaller set can be expanded to cover models with more expressive parameters. Rissanen (1983) proposes a prior for unbounded integral parameters, in which the code length (and hence the implied prior probability) for a natural number z is given by

L0(z) = lg∗(z) + lg c. (4)

In this expression, lg∗(z) = lg z + lg lg z + · · · , where the sum includes only the positive terms of the sequence, and the constant c = ∑_n 2^−lg∗(n) ≈ 2.865064. Integers (that is, positive and negative whole numbers) can be encoded by adding a single bit to indicate the sign of the number.
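Equation 4 can be evaluated directly; the sketch below uses base-2 logarithms throughout and takes the value of c from the text. The function names are my own.

```python
import math

def lg_star(z):
    """lg*(z) = lg z + lg lg z + ..., keeping only the positive terms."""
    total, v = 0.0, math.log2(z)
    while v > 0:
        total += v
        v = math.log2(v)
    return total

C = 2.865064  # normalizing constant from the text, sum over n of 2**(-lg*(n))

def integer_code_length(z):
    """L0(z): bits needed to encode the natural number z (Equation 4)."""
    return lg_star(z) + math.log2(C)

# Small integers are cheap, larger ones progressively dearer:
assert lg_star(2) == 1.0    # lg 2 = 1; the next term lg lg 2 = 0 is dropped
assert lg_star(16) == 7.0   # 4 + 2 + 1
```

The iterated-log prior thus spends roughly lg z bits on z itself plus a small surcharge for encoding how long that first code is, which is what makes it usable for unbounded parameters.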
Real-valued parameters pose a potential problem to using a probabilistic approach for model selection.
The uncountability of real numbers or even interval-subsets of R means that assigning any positive probability
to every possible real-value of a model parameter would violate the law of total probability. As such, in order
to assess the probability of a real-valued parameter, it must be mapped to some countable subset. Rissanen (1989) approaches this problem by arguing that for n observations of data, there is an optimal precision d, such that the parameter should be limited to taking on 2^d values. This precision can be determined as

d(n) = (lg n)/2 + c_d, (5)

where c_d is a negligible constant related to the curvature of the description length function (Rissanen 1989: 56).
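Equation 5 yields concrete precisions once the negligible constant is dropped. The use of Sampson's eighteen-monk network as a worked example here is my own illustration, not the source's.

```python
import math

def optimal_precision(n):
    """d(n) ~ (lg n)/2: bits of precision warranted by n observations
    (Equation 5, with the negligible constant c_d dropped)."""
    return math.log2(n) / 2.0

# A parameter fit to the g(g-1) = 306 directed ties among 18 actors
# warrants only about 4.1 bits of precision; precision grows slowly,
# by half a bit per doubling of the data.
precision = optimal_precision(18 * 17)
```

This slow growth is what keeps the parameter-cost term of a description length from overwhelming the fit term as networks get large.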
Inasmuch as most stochastic blockmodels can be represented with either integral or real-valued independent parameters, these results can be straightforwardly applied to the task of blockmodel selection, and
accordingly, to the task of using stochastic blockmodels to identify boundaries in populations. These results,
and Equations 2 and 3 in particular can moreover be used as a theoretically grounded framework against
which other model selection criteria can be evaluated.
2.2 Alternative Model Selection Criteria
The MDL and MML approaches are not unique in their separate evaluation of model accuracy and model
specificity, nor are they unique in employing a stochastic assessment p(x|θ) in measuring model accuracy.
Consider the AIC (Akaike 1974) and BIC (Raftery 1995) mentioned above. Each of these criteria can (up to a scaling factor) be represented in such a way as to measure the fit of a model to observed data as p(x|θ), as follows:
AIC(x, θ) = −|θ| + lg p(x|θ), (6)

BIC(x, θ) = −(|θ|/2) lg n + lg p(x|θ), (7)

where |θ| is a count of the number of parameters in the model θ and n is the number of observations. Each of these expressions also corresponds to a logarithmic representation of Equation 3. Using Bayes' formula and Equation 2 to draw implications about how each of these expressions would assign a prior probability p(θ) to the observation of a given model yields

pAIC(θ) = 2^−|θ|, (8)

pBIC(θ) = (√n)^−|θ| = n^−|θ|/2. (9)
Like the algorithmic complexity approaches outlined above, both of these measures evaluate models with
increasing numbers of parameters as having lower prior probabilities. The principal distinction between
Equation 8 and Equation 9 concerns their respective sensitivity to the number of observations n being
modeled, and the implications of this sensitivity for the assignment of probability to possible models.
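Equations 8 and 9 can be compared numerically. The sketch below simply evaluates the two implied priors, with parameter counts and sample sizes chosen for illustration only.

```python
def prior_aic(k):
    """Implied AIC prior (Equation 8): one bit per parameter."""
    return 2.0 ** (-k)

def prior_bic(k, n):
    """Implied BIC prior (Equation 9): (lg n)/2 bits per parameter."""
    return float(n) ** (-k / 2.0)

# The two coincide exactly when n = 4 (since sqrt(4) = 2) ...
assert abs(prior_aic(3) - prior_bic(3, 4)) < 1e-12

# ... but for larger samples the BIC prior penalizes each additional
# parameter more heavily, while the AIC prior is fixed at one bit.
assert prior_bic(3, 306) < prior_aic(3)
```

The divergence grows with n, which is why the two criteria can disagree sharply about the right number of groups in even moderately sized networks.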
Recall the example of Sampson’s monastery discussed above. The blockmodel illustrated in Figure 2
describes one of many possible models of the pattern of affect relations between the monks. A model of
these relations based on three positions would at the very least have nine parameters, corresponding to the
nine cells in the table. In the case where these are dichotomous parameters, there are 2^9 possible models
covering relations between these three positions. Following the logic outlined in the prior section, this has
direct implications for the precision used in representing model parameters. According to Equation 8, every
parameter in a model should only be represented by a single bit. Put differently, in order for the penalty imposed
by the AIC to correspond to a prior probability assessment on θ, the independent parameters that compose
θ must be dichotomous.
The first term of Equation 7 has a similar set of implications. As noted by Stine (2004: 243), for large n,
the penalty imposed by the BIC is similar to that of an algorithmic complexity model probability assessment.
However, the BIC penalty forces every parameter to be evaluated as a truncated real-valued parameter, an assumption that may not be valid in every case. Moreover, as Stine (2004) goes through considerable
trouble to demonstrate, the BIC is not flexible enough to manage the model selection task in which only
limited subsets of possible models are considered by a researcher. In general, straightforward application of
the BIC places limits on the assignment of model priors that may not in every case be appropriate.
Wasserman and Faust (1994) propose a third alternative to model selection specifically in the context
of stochastic blockmodeling. This alternative is only applicable to stochastic blockmodels that define a
predicted value x̂ij for every observed tie value xij. Wasserman and Faust propose that the likelihood-ratio statistic G² should be used to assess the goodness-of-fit of a stochastic blockmodel θ,

G²θ = 2 ∑_{i,j} xij log(xij / x̂ij). (10)
Besides the fact that it is unclear how to directly combine this metric with a measure of model accuracy such
as p(x|θ), it is not clear that there is a sound statistical basis to use this metric for many model selection
problems. In general, G2 metrics are used to compare nested models, and there are many model selection
problems, particularly in the context of attempting to locate boundaries in social systems, where there will
be candidate models that are not nested submodels. Moreover, Wasserman and Faust (1994: 703) note that
“this theory should be applied only to a priori stochastic blockmodels, because the ‘data mucking’ that must
be done to fit their a posteriori counterparts invalidates the use of a statistical theory”. The algorithmic
complexity approach outlined here provides a much clearer interpretation of what the model selection task
is, and is as such better suited to the general task of identifying boundaries in social exchange data.
3 Algorithmic Complexity and Stochastic Blockmodeling
One of the particularly useful features of the algorithmic complexity approach to model selection is that it
can be used to choose between models that are not nested within each other, or indeed not related by any specific functional relationship at all. This has a particular bearing on the problem of identifying groups and
boundaries in social networks due to the wide range of features that can produce exchange behavior that
results in these kinds of structures. Each of these network features or mechanisms typically corresponds to
a parameter or set of parameters that must be included in a given statistical network model. In order to
use the algorithmic complexity approach to compare these models, and in particular to determine L(θ) in
Equation 2, these features must be related to a specific set of model parameters, each with a specified level
of parameter precision.
To that end, this section proceeds as follows. I begin by outlining a baseline set of statistical features
that forms the core of most statistical models of network exchange. While these features do not per se relate
to group and/or boundary structures in networks, they do correspond to well-known features of exchange
patterns in general. The failure to include these features in one form or another in a statistical network
model can distort the estimation of the sought-after group effects. I then discuss a set of network features
and corresponding stochastic blockmodels in which the group memberships of actors directly and explicitly affect the pattern of network exchange. Finally, I discuss a set of local network processes that can produce
group and boundary structures without explicit reference to nominal groups.
3.1 Reciprocity and Individual Variation
Two of the earliest contributions to the statistical study of social networks were the observations that actors
in real social networks are likely to reciprocate the exchange behavior of their interaction partners, and
that these actors are generally differentiated in the extent to which they participate in exchange. The
p1 stochastic network model (Holland and Leinhardt 1981) is an extension of a Bernoulli random graph
that seeks to capture precisely these features of social networks. Given the interest in explicitly modeling
reciprocity as a network feature, the outcome variables in the p1 model are dyads Dij rather than ties xij .
To this end, Holland and Leinhardt (1981) begin with the MAN (mutual, asymmetric, and null) distribution
for dyads Dij which states that
mij = p(Dij = (1, 1)), (11a)
aij = p(Dij = (0, 1)), (11b)
aji = p(Dij = (1, 0)), (11c)
and
nij = p(Dij = (0, 0)), (11d)
such that
mij + aij + aji + nij = 1. (12)
The p1 distribution expands upon the reciprocity modeled by the MAN distribution by allowing het-
erogeneity in the exchange behavior of individual actors. Specifically, actors in a network are characterized
in terms of αi, which measures productivity, or the tendency of an actor i to generate ties, and βj , which
measures attractiveness, or the tendency of an actor j to receive ties. Based on these parameters, Holland
and Leinhardt propose the following exponential model for a directed graph:
p(x|θp1) = K exp{ ∑_{i<j} (ρij xij xji + λij xij + λji xji) }, (13)

where

ρij = log(mij nij / aij aji), (14a)

λij = log(aij / nij), (14b)

K = ∏_{i<j} (1/kij), (14c)

and

kij = 1 + e^λij + e^λji + e^(ρij + λij + λji). (14d)
The parameter λ is decomposed as
λij = λ + αi + βj for all i ≠ j, (15)
such that
α+ = β+ = 0. (16)
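The dyad distribution implied by Equations 13–14d can be checked directly: given ρ and the two λ values for a dyad, the four MAN probabilities follow from inverting Equations 14a–14d. The function name and parameter values below are my own illustrations.

```python
import math

def dyad_probs(rho, lam_ij, lam_ji):
    """MAN probabilities for a single dyad under the p1 model."""
    # k_ij normalizes the four possible dyad outcomes (Equation 14d).
    k = 1.0 + math.exp(lam_ij) + math.exp(lam_ji) \
            + math.exp(rho + lam_ij + lam_ji)
    null = 1.0 / k                                  # n_ij: no ties
    a_ij = math.exp(lam_ij) / k                     # asymmetric, i -> j only
    a_ji = math.exp(lam_ji) / k                     # asymmetric, j -> i only
    mutual = math.exp(rho + lam_ij + lam_ji) / k    # m_ij: reciprocated
    return mutual, a_ij, a_ji, null

# The four outcomes exhaust the dyad's possibilities (Equation 12).
m, a1, a2, n0 = dyad_probs(rho=1.5, lam_ij=-1.0, lam_ji=-0.5)
assert abs(m + a1 + a2 + n0 - 1.0) < 1e-12
```

Note that the odds of a mutual dyad relative to a null dyad are e^(ρ + λij + λji), so a positive ρ inflates reciprocated ties beyond what the two tie propensities alone would produce.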
The algorithmic complexity of a baseline p1 model can serve as a null hypothesis against which other
models can be compared. This model is characterized by a set of g − 1 productivity and g − 1 attractiveness parameters and a single reciprocity parameter, each of which is determined by the entire set of g(g − 1) network ties, where g is the total number of actors in the network. Accordingly, the algorithmic complexity for a p1 model can be written as

L(θp1, x) = − lg p(x|θp1) + (2(g − 1) + 1) d(g(g − 1)). (17)
While models that incorporate features that explicitly model group or boundary-sensitive exchange may
be more accurate predictors of the observed behavior in a given network, they will also, in general, be more
specific than this model. As such, in networks in which there is no statistical evidence for group structure, such models should have a higher total complexity than that given by Equation 17.
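Combining Equation 17 with the precision rule of Equation 5 gives a concrete parameter cost. The sketch below uses g = 18 and a hypothetical fit term, both illustrative assumptions of mine rather than results from the source.

```python
import math

def precision_bits(n):
    """d(n) from Equation 5, with the negligible constant dropped."""
    return math.log2(n) / 2.0

def p1_complexity(neg_lg_likelihood, g):
    """L(theta_p1, x): data cost plus the cost of 2(g-1) actor
    parameters and one reciprocity parameter (Equation 17)."""
    param_cost = (2 * (g - 1) + 1) * precision_bits(g * (g - 1))
    return neg_lg_likelihood + param_cost

# For g = 18 the parameter cost alone is 35 * lg(306)/2, about 144.5
# bits. A blockmodel beats this baseline only if its improvement in
# fit exceeds its own additional parameter cost.
baseline = p1_complexity(neg_lg_likelihood=200.0, g=18)  # fit term hypothetical
```

This is the sense in which the baseline p1 complexity acts as a null hypothesis: any group-structured competitor starts the comparison owing its extra description-length bits.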
3.2 Explicit Categories
One set of mechanisms that can produce exchange behavior that results in the production of group structure
would be comprised of those mechanisms in which explicit attention to group boundaries is implicated. Such
mechanisms either attribute different kinds of behavior to actors in different categories, or make predictions
about behavior between two actors on the basis of their respective group memberships. Stochastic blockmod-
els based upon these mechanisms can either predict group-oriented individual or dyadic behavior separately,
or can analyze the combined effect of both of these mechanisms simultaneously. Each of these approaches
has different implications for the determination of the algorithmic complexity of a stochastic blockmodel of
network exchange.
A straightforward way to model individual behavior as a function of group structure is to assume that
individual structural attributes are directly determined by the group or category that an actor belongs to.
Anderson et al. (1992) propose a basic extension to the p1 model that does this through placing restrictions
on the productivity and attractiveness parameters αi and βj . Extending the concept of structural equivalence
(Lorrain and White 1971), they define two actors to be stochastically equivalent if they have the
same probability of sending or receiving ties. The restrictions associated with this version of a stochastic
blockmodel, termed here a role-dependent stochastic blockmodel (RDB), can formally be stated as
φ(i) = φ(i′) = r ⇒ αi = αi′ = αr and βi = βi′ = βr,   (18)
where φ(·) is a partition mapping such that φ(i) = r implies that an actor i is a member of a group r.
These restrictions lead to the following probability distribution for a social network x given the role-
dependent stochastic blockmodel θRDB :
p(x|θRDB) = K exp{ ρm + λx++ + Σ_i αr xi+ + Σ_j βs x+j },   (19)
where r = φ(i), s = φ(j), and m is the total number of mutual ties in the network. The reciprocity parameter
ρ is equivalent to ρij as defined in Equation 14a, restricted to be the same for all dyads. Role-dependent
blockmodels are characterized by the two parameters ρ and λ, and two parameters for each block in a
blockmodel. Each of these parameters is effectively determined from the entire population of ties, so they
should all be optimally represented with d(g(g−1)) bits. Thus, the algorithmic complexity should be written
as
L(θRDB , x) = − lg p(x|θRDB) + (2(b− 1) + 2)d(g(g − 1)). (20)
An alternative to modeling the influence of group structures on the behavior of individual actors is
to model the influence of these structures on dyadic exchange. The pair-dependent stochastic blockmodel
(PSB)—one of the earliest stochastic blockmodels proposed—follows this logic quite explicitly (Holland
et al. 1983). A PSB is a model of a network that focuses on the network dyad vectors Dij = (Xij , Xji),
where Xij is a random variable representing the tie strength between an actor i and an actor j in a network
x. A probability distribution p(x) satisfies a pair-dependent stochastic blockmodel with respect to φ if and
only if:
1. the random vectors Dij are statistically independent, and
2. for any nodes i ≠ j and i′ ≠ j′, if φ(i) = φ(i′) and φ(j) = φ(j′), then the random vectors Dij and Di′j′
are identically distributed.
In other words, ties between actors in a block r = φ(i) and actors in a block s = φ(j) should be drawn
from the same distribution, or the probability of a particular dyad configuration existing between two actors
should depend only on the block assignments of the two actors.
This is a general definition which specifies how the probability distributions from which ties are drawn
should relate to block structure, but says nothing of the particular distributions for a particular block-
pair r × s. Holland et al. define the stochastic blockmodel with reciprocity (SBR) as a special case of the
PSB, where the distribution within a pair block is related to the p1 model (Holland and Leinhardt 1981).
Expanding on Equations 11, they define the parameters of the MAN distribution at the block pair level as
mrs = p(Dij = (1, 1)|φ(i) = r, φ(j) = s), (21a)
ars = p(Dij = (0, 1)|φ(i) = r, φ(j) = s), (21b)
asr = p(Dij = (1, 0)|φ(i) = r, φ(j) = s), (21c)
and
nrs = p(Dij = (0, 0)|φ(i) = r, φ(j) = s), (21d)
such that
mrs + ars + asr + nrs = 1. (22)
These quantities can be used to determine the probability of any given dyad Dij in a network conditioned
on a blockmodel θSBR:
p(x|θSBR) = Π_{i<j} mrs^{xij xji} ars^{xij(1−xji)} asr^{(1−xij)xji} nrs^{(1−xij)(1−xji)},   (23)
where r = φ(i) and s = φ(j). The authors re-express these block-pair parameters in terms of three other
parameters, λrs, λsr and ρrs, defined as
λrs = log(ars / nrs),   (24a)
λsr = log(asr / nsr),   (24b)
ρrs = log(mrs nrs / (ars asr)).   (24c)
The parameter ρrs is symmetric with respect to a block-pair r × s, and measures the tendency for ties
to be reciprocated within a block-pair. The parameters λrs and λsr measure the tendency for ties to be
asymmetrically sent from block r to block s and from block s to block r, respectively. Holland et al. impose
the restriction ρrs = ρ such that reciprocity is constant within a particular network.
The algorithmic complexity of a stochastic blockmodel with reciprocity θSBR can be determined as
follows. For every block pair r×s there is a parameter λrs. The parameter λrs is derived from the set of ties
from block r to block s, and there are potentially grs such ties for each block-pair. Following Equation 5,
the optimal precision for each of these parameters should be d(grs), where
grs = gr gs if r ≠ s, and grs = gr(gr − 1) if r = s,   (25)
and gr is defined as the total number of actors assigned to a position r. The model complexity must therefore
include the sum of the complexities for each of these parameters, Σ_{r,s} d(grs). Additionally, there is a single
parameter ρ which is determined from the set of all possible ties. There are (g(g − 1)) such ties, and thus
the model complexity should further include the complexity of this single parameter d(g(g − 1)).
L(θSBR) = d(g(g − 1)) + Σ_{r,s} d(grs).   (26)
Given that the complexity for the data in terms of the model L(x|θ) is always equal to − lg p(x|θ), applying
Equation 2 yields
L(θSBR, x) = − lg p(x|θSBR) + d(g(g − 1)) + Σ_{r,s} d(grs).   (27)
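The model-complexity terms of Equations 25–27 depend only on the block sizes, as the following illustrative sketch shows (the two-block partition is hypothetical, and d(n) = lg(n)/2 is assumed from Equation 5).

```python
import math

def d(n):
    return math.log2(n) / 2  # precision rule assumed from Equation 5

def g_rs(sizes, r, s):
    # Number of possible ties from block r to block s (Equation 25).
    return sizes[r] * sizes[s] if r != s else sizes[r] * (sizes[r] - 1)

def sbr_model_complexity(sizes):
    """Model complexity L(theta_SBR) of Equation 26: one lambda parameter
    per ordered block pair at precision d(g_rs), plus a single reciprocity
    parameter rho at precision d(g(g - 1))."""
    g = sum(sizes.values())
    total = d(g * (g - 1))
    for r in sizes:
        for s in sizes:
            total += d(g_rs(sizes, r, s))
    return total

# Hypothetical partition of 27 actors into blocks of 10 and 17.
L_theta = sbr_model_complexity({0: 10, 1: 17})
```

Adding − lg p(x|θSBR) from a fitted model then gives the total description length of Equation 27.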
While the SBR does model dyadic group-level interaction, it does not allow for the heterogeneity among
actors that the p1 network model accounts for. Wang and Wong (1987) propose a p1 stochastic blockmodel
(P1B) that extends the basic p1 model to incorporate the possibility of within and between group effects on
exchange behavior. The p1 stochastic blockmodel is similar to the role-dependent model inasmuch as both
use block structure to decompose the asymmetric tie parameter λij . In this approach to blockmodeling, a
parameter λrs is introduced as an “interaction term”, which allows block structure to affect the tendencies
for ties to exist between blocks. This parameter is defined through the decomposition of λij , such that
λij = λ + αi + βj + λrs, (28)
where r = φ(i) and s = φ(j), and λ is taken as a constant across the network. In addition, λrs is subject
to the side constraints λr+ = 0 and λ+s = 0, which are similar to the constraints typically placed on αi and
βj . These constraints lead to the following probability distribution for a p1 blockmodel θP1B :
p(x|θP1B) = K exp{ ρm + λx++ + Σ_{r,s} λrs x++(rs) + Σ_i αi xi+ + Σ_j βj x+j },   (29)
where x++(rs) represents the total number of ties in the r×s block. Wang and Wong allow for the possibility
that the λrs parameters could be further restricted, such that some λrs values are forced to be equal. For
instance, they consider the case where only two values of λrs are realized in a model, λrs = λd for diagonal
blocks, and λrs = λo for off-diagonal blocks. This model represents a hypothesis that the likelihood of a
tie between actors in the same position is different than the likelihood of a tie between actors in different
positions. In general, they allow for λrs to take on at most k values, where k ≤ (b − 1)² because of the side
conditions on λrs.
The P1B has k free block parameters, in addition to the 2(n − 1) parameters for individual actor at-
tractiveness and expansiveness, as well as a λ and ρ for the entire network. Each of these parameters
is effectively determined from the population of all ties and should thus be represented with a precision
d(g(g − 1)). Accordingly, the complexity for a p1 blockmodel is
L(θP1B , x) = − lg p(x|θP1B) + (2(n− 1) + k + 2)d(g(g − 1)). (30)
One stochastic blockmodeling approach that incorporates group effects both on individual and dyadic
behavior is the p2 model (van Duijn et al. 2004). The proposed model is consistent with the p1 model in
that the behavior of actors is captured in the parameters αi and βj , but the p2 model also allows dyadic tie
density λij and reciprocity ρij to vary as well. Actor attributes αi and βj are effectively modeled under p1
as fixed-effects with no other co-variates. Under p2, these actor attributes are modeled as random effects
with a set of covariates Y1 and Y2 such that
αi = Y1iγ1 + Ai (31a)
βj = Y2jγ2 + Bj (31b)
where γ1 and γ2 are regression weights, and the residuals Ai and Bj are randomly distributed with mean 0
and variances σ²_A and σ²_B respectively. Similarly, dyadic density and reciprocity are modeled as
λij = λ + Z1ijδ1 (32a)
ρij = ρ + Z2ijδ2 (32b)
where δ1 and δ2 are regression weights, and the reciprocity matrix Z2 is constrained to be symmetric such
that Z2ij = Z2ji.
This original specification of the p2 model, to the extent that it only allows a single set of actor attribute
vectors Y1 and Y2 and a single set of dyadic attribute matrices Z1 and Z2, can only be used to model
boundary-sensitive behavior in limited ways. However, subsequent representations of the p2 model (Zijlstra
and Duijn 2003) relax this restriction, allowing individual and dyadic group effects to be modeled directly
with multiple dummy variables corresponding to multiple attribute vectors and matrices. A p2 stochastic
blockmodel that has no co-variates other than those indicating group membership then has a parameter for
each group-level parameter αr and βs, as well as a parameter for each group-level dyadic parameter λrs and
ρrs. If these parameters are subject to the constraints α+ = β+ = λr+ = λ+s = ρr+ = ρ+s = 0, then an
expression for the complexity of the p2 stochastic blockmodel can be written as
L(θP2B, x) = − lg p(x|θP2B) + (2(b − 1) + 2)d(g(g − 1)) + Σ_{r,s} d(grs) + Σ_{r<s} d(2grs) + Σ_r d(grr).   (33)
3.3 Transitivity and Local Structure
While mechanisms that make explicit reference to category memberships can produce group structure in
social network exchange in a fairly straightforward way, there are other network processes that can produce
outcomes that appear quite similar. Specifically, there is a set of local processes and structures that,
when preferentially attended to in a given network, can produce clustering and other outcomes that, on
the surface, are difficult to distinguish from structures produced by the orientation of exchange behavior
to explicit categories. While these outcomes are similar, to the extent that a researcher is interested in
identifying the process by which group structures are formed, it is critical to model network exchange in a
way that is sensitive to the potentially independent presence of both of these processes and mechanisms.
A particular local structural mechanism that can lead to clustered social exchanges is a preferential bias
towards transitive closure in triads, particularly in networks that are also characterized by dyadic reciprocity.
In such systems, the establishment of a tie between a member of any existing cluster and an otherwise isolated
new actor will either result in that new actor being tied to all other members of the cluster, or the dissolution
of the original tie. While this local process can clearly produce clustered networks as easily as the explicit
structural mechanisms identified in the previous section, it is based on a process which operates on triads of
actors, and as such violates the assumption of dyadic independence that serves as a foundation for the p1
model and its aforementioned progeny.
While the p1 model is constrained by the assumption of dyadic independence, there are other statistical
models of graphs and networks that relax this assumption and account for a wider range of dependency
structures (Besag 1974; Frank and Strauss 1986; Pattison and Robins 2002). In the context of stochastic
models of social networks, these models are referred to as exponential random graph models (ERGM) and a
subset of these are referred to as p∗ models (Wasserman and Pattison 1996; Anderson et al. 1999; Snijders
et al. 2006). As these models are not based on the assumption of dyadic independence, they are able to
consider the effect of transitivity and other local structure on the production of group structure in network
exchange.
A general p∗ model (Wasserman and Pattison 1996; Anderson et al. 1999) predicts the observation of a
particular network as a function of an arbitrary number t of sufficient graph statistics ui(x), such that
p(x|θ) = exp{θ′u(x)} / κ(θ) = exp{θ1 u1(x) + · · · + θt ut(x)} / κ(θ),   (34)

where θ is a vector of linear model parameters and κ(θ) is a normalizing constant relative to the set of all
possible networks. Assessing the structure of a given network entails identifying an appropriate set of graph
statistics u(x) that collectively are able to model all micro-structures that could lead to the observed pattern
of exchange. For instance, the set of graph statistics u^(b)_rs(x), measured as

u^(b)_rs(x) = Σ_{φ(i)=r, φ(j)=s, i≠j} xij,   (35)
can be used to straightforwardly model the tendency of actors i in a group r to direct exchange at actors j
in a group s. Similarly, the graph statistic
u^(t)_0(x) = Σ_{i≠j≠k} xij xjk xik,   (36)
to the extent that it measures the observed number of transitive triples i, j, k in a graph, can be used as a
measure of transitivity.
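Both statistics are simple counts over a binary adjacency matrix, as the following illustrative sketch shows (x is an adjacency list-of-lists and phi maps each actor to a block; neither the code nor the example network comes from the original analysis).

```python
def block_tie_counts(x, phi):
    """Block-pair tie counts of Equation 35: u[(r, s)] is the number of
    ties sent from actors in block r to actors in block s."""
    u = {}
    n = len(phi)
    for i in range(n):
        for j in range(n):
            if i != j and x[i][j]:
                key = (phi[i], phi[j])
                u[key] = u.get(key, 0) + 1
    return u

def transitive_triples(x):
    """First-order transitivity statistic of Equation 36: the number of
    ordered triples (i, j, k) with ties i->j, j->k, and i->k."""
    n = len(x)
    return sum(x[i][j] * x[j][k] * x[i][k]
               for i in range(n) for j in range(n) for k in range(n)
               if i != j and j != k and i != k)

# A single transitive triple: 0 -> 1, 1 -> 2, closed by 0 -> 2.
x = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
assert transitive_triples(x) == 1
```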
While the expressive range of ERGMs is particularly useful for modeling group structure in social net-
works, the empirical assessment of these models can be difficult. Modeling the effect of graph constructs like
transitivity that are based on higher-order structures like triads requires that other lower-order structures
(two-stars, dyads) embedded in these higher-order structures be accounted for as well. Moreover,
the analytic and empirical evaluation of Equation 34 is difficult, due principally to the intractability
of determining κ(θ). A number of authors have proposed a pseudo-likelihood logit estimator (Strauss and
Ikeda 1990; Wasserman and Pattison 1996; Anderson et al. 1999) to address this problem. However, many
of the properties of even this estimator are not fully determined, and it is known to be unstable (Snijders
2002; Snijders et al. 2006), particularly with respect to graph statistics corresponding to these lower-order
structures.
In order to address some of these difficulties, an alternative set of graph statistics has been proposed for
ERGMs that is able to model transitivity and associated local structures without necessarily being subject
to these instability issues in estimation (Hunter and Handcock 2006; Snijders et al. 2006; Hunter 2007).
The estimation instability is in part attributed to the fact that models based on first-order graph statistics
place high probability on graphs with high indegree and outdegree. One general solution to this problem
is to use models that include parameters that place progressively less weight on the observation of graphs
with high degree. One way to do this is to use a set of weights that decrease geometrically by a factor α.
Graph statistics based on this idea are typically referred to as geometrically weighted measures. Following
this logic, Snijders et al. (2006) define a pair of measures u^(od)_α(x) and u^(id)_α(x) to assess the extent to which
the tendency of actors to have large out-degrees or in-degrees, respectively, characterizes the structure of a
network. These measures are defined as
u^(od)_α(x) = Σ_{i=1}^{n} e^{−α xi+},   (37a)
u^(id)_α(x) = Σ_{j=1}^{n} e^{−α x+j},   (37b)
where α > 0 is a parameter controlling the geometric rate of decrease in the weights.
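As an illustration (not part of the original analysis), these statistics can be computed directly from the degree sequences of a binary adjacency matrix with a zero diagonal:

```python
import math

def gw_degree_stats(x, alpha):
    """Geometrically weighted degree statistics of Equations 37a-37b:
    each actor contributes e^(-alpha * degree), so high-degree actors
    receive geometrically smaller weight (alpha > 0). Assumes x is a
    binary adjacency list-of-lists with a zero diagonal."""
    n = len(x)
    out_deg = [sum(row) for row in x]
    in_deg = [sum(x[i][j] for i in range(n)) for j in range(n)]
    u_od = sum(math.exp(-alpha * k) for k in out_deg)
    u_id = sum(math.exp(-alpha * k) for k in in_deg)
    return u_od, u_id

# In an empty network every actor has degree 0, so both statistics equal n.
u_od, u_id = gw_degree_stats([[0] * 5 for _ in range(5)], alpha=0.5)
```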
While these statistics address the problem of estimability, they introduce α as a variable that must be
chosen by a modeler, rather than as a parameter that can be estimated. Estimating the geometric weighting
factor as well requires that the class of ERGMs be expanded to include curved exponential families, only some
subset of which can be estimated (Hunter and Handcock 2006; Hunter 2007). Within this set of constraints,
Hunter (2007) proposes a modified version of the geometrically weighted degree statistics as
u^(od)_α(x; θs) = e^{θs} Σ_{i=1}^{n−1} {1 − (1 − e^{−θs})^i} ODi(x),   (38a)
u^(id)_α(x; θs) = e^{θs} Σ_{i=1}^{n−1} {1 − (1 − e^{−θs})^i} IDi(x),   (38b)
where ODi(x) is the number of actors in a network that have outdegree i, and IDi(x) is the number of
actors in a network that have indegree i.
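Equation 38a depends on the network only through the outdegree distribution, as a short illustrative sketch makes clear (od_counts[i−1] holds ODi(x); the numerical example is hypothetical):

```python
import math

def gw_outdegree_curved(od_counts, theta_s):
    """Hunter's curved-family geometrically weighted outdegree statistic
    (Equation 38a); the indegree version (38b) is identical with ID counts.
    od_counts[i-1] is the number of actors with outdegree i."""
    return math.exp(theta_s) * sum(
        (1 - (1 - math.exp(-theta_s)) ** i) * count
        for i, count in enumerate(od_counts, start=1)
    )

# If five actors each have outdegree 1, the i = 1 term reduces to
# e^theta * e^(-theta) * 5, i.e. 5 for any value of theta_s.
value = gw_outdegree_curved([5], theta_s=0.7)
```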
Applying the idea of geometrically weighted statistics to the modeling of the dependence of network
structure on degree counts is a necessary precursor to modeling transitivity—the construct of particular
interest in this article. The tendency for actors to disproportionately attract attention or direct attention
toward others is modeled in the former case by counts of k-instars and k-outstars, respectively. By a similar
logic, transitivity can be modeled by accounting for the presence and effect of k-transitive triangles and
k-independent directed twopaths. Snijders et al. (2006) propose a set of geometrically weighted graph statistics
to do this as
u^(t)_η(x) = η Σ_{i≠j} xij {1 − (1/η)^{L2ij}},   (39a)
u^(p)_η(x) = η Σ_{i≠j} {1 − (1/η)^{L2ij}},   (39b)

where η = e^α/(e^α − 1) and

L2ij = Σ_{k≠i,j} xik xkj   (40)

is the number of directed two-paths from i to j. The implied corresponding statistics in the curved exponential
family are then
u^(t)_η(x; θt) = e^{θt} Σ_{i=1}^{n−2} {1 − (1 − e^{−θt})^i} T3Ti(x),   (41a)
u^(p)_η(x; θp) = e^{θp} Σ_{i=1}^{n−2} {1 − (1 − e^{−θp})^i} I2Pi(x),   (41b)
where T3Ti(x) is equal to the number of transitive three-triangles in x and I2Pi(x) is equal to the number of
3-independent two-paths in x (Snijders et al. 2006).
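For illustration, the two-path counts of Equation 40 and the geometrically weighted statistics of Equations 39a–39b can be sketched as follows (the three-actor network is hypothetical):

```python
import math

def two_path_count(x, i, j):
    """L2_ij of Equation 40: directed two-paths from i to j."""
    n = len(x)
    return sum(x[i][k] * x[k][j] for k in range(n) if k != i and k != j)

def gw_transitivity_stats(x, alpha):
    """Geometrically weighted transitive-closure statistics of Equations
    39a-39b, with eta = e^alpha / (e^alpha - 1); requires alpha > 0."""
    eta = math.exp(alpha) / (math.exp(alpha) - 1)
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    u_t = eta * sum(x[i][j] * (1 - (1 / eta) ** two_path_count(x, i, j))
                    for i, j in pairs)
    u_p = eta * sum(1 - (1 / eta) ** two_path_count(x, i, j)
                    for i, j in pairs)
    return u_t, u_p

# One transitive triple (0 -> 1 -> 2 closed by 0 -> 2): a single two-path
# backs the 0 -> 2 tie, so with alpha = ln 2 (eta = 2) both statistics are 1.
x = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
u_t, u_p = gw_transitivity_stats(x, alpha=math.log(2))
```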
An ERGM that includes all four of these parameters can be used to distinguish the effect of transitivity
u^(t)_η from each of these other effects. Each of these four parameters is derived based on the value of all n(n−1)
ties in the network. A fully specified ERGM would also include a set of u^(b)_rs parameters, each of which would
be determined strictly by the grs tie values within a given block pair. Accordingly, the complexity of an
ERGM specified with all of these parameters can be determined as
L(θERGM, x) = − lg p(x|θERGM) + Σ_{r,s} d(grs) + 4d(n(n − 1)).   (42)
The complexity of a submodel of this ERGM that does not include micro-structural effects but only includes
block-pair statistics u^(b)_rs can be determined by dropping the last term of Equation 42.
Each of the mechanisms identified in this section can be included in a stochastic model to predict the
likelihood of observing a given set of network exchanges x. The parameterization of these mechanisms
can likewise be used to numerically determine the model specificity L(θ), such that competing models can
be compared. While most of the mechanisms identified here are independent of one another and could in
theory be modeled simultaneously, in the following section I restrict attention to a comparison of one explicit
categorical structure—dyadic group-level interaction—and transitivity as a local structural mechanism.
4 Empirical Applications
In this section I present two empirical analyses in order to demonstrate the utility of the algorithmic com-
plexity approach in the assessment of different kinds of structure in network data. The first data set is
derived from an analysis of friendship patterns among sixth-grade boys and girls (Hansell 1984). A second
data set is derived from affect relations between monks in a monastery (Sampson 1968). As such, these
data sources are appropriate applications for the methods presented here for two principal reasons. The
first is that each of these data sets have been extensively analyzed in earlier methodological studies of the
statistical structure of social networks (Holland and Leinhardt 1981; Wang and Wong 1987). These studies
provide a reasonable set of baseline expectations against which the results presented here can be evaluated.
Secondly, the empirical content of these networks provides a particularly relevant context in which to explore the
question of whether or not group-like structures are the result of externally-imposed categories or micro-level
exchange processes. The network derived from the first data set represents a context in which it would be
perfectly reasonable to expect that gender might serve as an external characteristic that would structure
friendship relations, even in the presence of tendencies toward transitive closure. The structuring mechanism
in the second network, however, is less clear. While external status markers may have played a role in
determining group structure among the monks, it is decidedly less clear what these markers may have been,
which makes this second case an empirically interesting context for the question put forward here.
4.1 Implementation
The examples described below are presented with the intention of replicating existing studies as closely as
possible, making only the modifications necessary to implement the algorithmic complexity approach to
blockmodel selection. To this end, the p1 blockmodeling approach (Wang and Wong 1987) was used in the
analysis of both data sets. ERGM estimates were performed using statnet (Handcock et al. 2005).
Estimation of the p1 stochastic blockmodel parameters entails solving for the estimated counts mij , aij
and nij , following the iterative scaling procedure outlined by Wang and Wong (1987). These estimated values
are used to produce estimates of αi, βj and λrs for comparison purposes. Then, the estimated parameters are
used to determine estimates of the dyad probabilities Dij , which are in turn used to estimate the probability
of the entire observed network in terms of the blockmodel p(x|θ). This network probability is then used to
determine a description length L(x, θ) for each blockmodel analyzed, and comparisons can be made between
these description lengths. Estimates for the dyad probabilities Dij where i < j are determined directly from
the parameter estimates mij , aij and nij using Equations 11. These dyad probability estimates Dij are
multiplied together to determine the probability of the observed network given the blockmodel p(x|θ):

p(x|θ) = Π_{i<j} Dij.   (43)
This estimate of p(x|θ) is used in Equation 30 to determine the description length L(θP1B , x) of a p1
blockmodel.
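In practice the product in Equation 43 underflows floating-point arithmetic for networks of any size, so an implementation would accumulate −lg p(x|θ) as a sum of logs before applying Equation 30. A sketch with hypothetical dyad probabilities (illustrative only; this is not the code used in the analysis):

```python
import math

def neg_lg_network_prob(dyad_probs):
    """-lg p(x|theta) via Equation 43, summing logs rather than multiplying
    probabilities, to avoid underflow over hundreds of dyads."""
    return -sum(math.log2(p) for p in dyad_probs)

def p1b_description_length(dyad_probs, n, k, g):
    # Equation 30, assuming the precision rule d(m) = lg(m) / 2 (Equation 5).
    d = lambda m: math.log2(m) / 2
    return (neg_lg_network_prob(dyad_probs)
            + (2 * (n - 1) + k + 2) * d(g * (g - 1)))

# Hypothetical dyad probabilities: four dyads, each with probability 1/2,
# contribute exactly 4 bits of data description length.
bits = neg_lg_network_prob([0.5, 0.5, 0.5, 0.5])
```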
While the above discussion illustrates a theoretical rationale for preferring geometrically
weighted transitive closure statistics to the first-order transitive triadic closure statistic u^(t)_0(x) in modeling
ERGMs, the current implementation of statnet does not support this newer measure. As such, while
the ERGM analyses presented here do employ geometrically weighted indegree and outdegree statistics, they
use the first-order transitivity statistic. The impact of this choice, however, is mitigated by the fact that
the geometrically weighted degree statistics contribute to the stability of the estimation process.
4.2 Assessing Structure in Hansell’s Classroom
The analysis of the classroom data (Hansell 1984) by Wang and Wong (1987) poses the question of whether
a model that considers the gendered relationships of the classroom members provides a better description
of the friendship ties than one that does not. The specific two-position blockmodel they test differentiates
between same-sex ties and opposite-sex ties, in addition to modeling the expansiveness and attractiveness
of each actor. Here, I apply the algorithmic complexity approach to answer that same question, specifically
to explore the possibility that a more accurate description is not necessarily a better description. In a
subsequent analysis, I also explore the question of whether or not group structure is a function of behavior
being shaped by explicit categories, or rather the result of micro-structural transitive closure processes.
The p1 stochastic blockmodel analysis begins with the estimation of the parameters mij , aij and nij
for the n = 27 actors in the network. Following Wang and Wong (1987), these parameters are estimated
using an iterative scaling process. These estimates are then used to compute estimates of αi, βj , and the
λrs. The estimates were determined by solving the system of linear equations defined by Equation 28 using
the method of Gaussian elimination. The iterative scaling procedure used to determine the estimates of
mij , aij and nij fit the block marginals of these parameters, so the αi and βj determined here express
expansiveness and attractiveness relative to the choice tendencies expressed by the block structure.
Model 1 is a baseline model only including these expansiveness and attractiveness parameters with no group
structure, while Model 2 explicitly estimates choice density within and across gender. The results of this
estimation process are presented in Table 1. The estimated overall choice density for same-sex blocks is
λ00 = λ11 = −0.523, and the estimated overall choice density for opposite-sex blocks is λ10 = λ01 = −2.308.
The next step in the process involves estimating the probabilities of the observed dyads in terms of the
model p(Dij |θ). A probability p(Dij |θ) is estimated for every dyad in the network, and these probabilities
are multiplied together in order to estimate the probability of the observed network given the blockmodel
p(x|θ), as indicated by Equation 43.
In order to estimate the model likelihood L(θ), the optimal precision for the model parameters must be
determined. The sample size for these parameters is the n(n − 1) = 702 tie strengths to be estimated.
Using Equation 5, I determine the optimal precision for these parameters to be
d(702) = lg 702 / 2 = 9.455 / 2 = 4.727.   (44)
Table 1: Maximum Likelihood Estimates of αi and βj from Hansell (1984) Data

            Model 1            Model 2
Student     αi       βj        αi       βj
1           -0.038   0.354     0.413    0.110
2           -1.23    0.261     -1.474   -0.553
3           0.33     -0.745    0.413    -1.499
4           1.627    -0.305    1.459    -0.836
5           1.478    -0.305    1.848    0.531
6           1.093    -0.549    1.049    0.051
7           -0.216   -0.908    -0.519   -1.809
8           0.598    -0.745    0.561    -1.499
9           -0.719   0.190     -0.757   -0.200
10          −∞       0.629     −∞       0.224
11          -2.492   -1.049    -2.295   -0.874
12          -0.883   -0.981    -0.633   -0.796
13          -1.23    0.118     -1.474   -0.285
14          0.308    0.958     0.000    0.911
15          -0.642   1.546     -0.315   1.421
16          2.995    0.865     3.439    0.949
17          -2.018   1.368     -1.609   1.349
18          -2.018   0.701     -1.694   1.349
19          1.33     -0.434    1.317    -0.024
20          1.607    −∞        1.224    −∞
21          -0.28    0.865     -0.510   0.838
22          0.547    -0.651    0.503    -0.188
23          0.432    -0.745    -0.163   -0.261
24          -1.034   0.701     -1.694   0.838
25          0.457    0.958     0.909    0.911
26          −∞       -1.049    −∞       -0.329
27          −∞       -1.049    −∞       -0.329
Table 2: Blockmodel Measures Applied to Hansell (1984) Data

          # Positions   G²θ     lg(p(x|θ))   L(θ)    L(x, θ)
Model 1   1             153.6   -372.7       255.3   627.9
Model 2   2             132.1   -328.3       260.0   588.3
Given this result, the determination of L(θ) is straightforward, following directly from Equation 30.
L(θP1B) = (2(n − 1) + k + 2)d(g(g − 1))
        = (2(27 − 1) + 1 + 2)d(702)
        = 55 × 4.7276
        = 260.02   (45)
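The arithmetic of Equations 44 and 45 is easily checked (an illustrative snippet, not part of the original analysis):

```python
import math

# Equation 44: optimal precision for parameters estimated from 702 ties.
d_702 = math.log2(702) / 2

# Equation 45: 55 parameters (2(27 - 1) + 1 + 2), each at precision d(702).
L_theta = (2 * (27 - 1) + 1 + 2) * d_702

assert abs(d_702 - 4.7276) < 1e-3   # roughly 4.73 bits per parameter
assert abs(L_theta - 260.02) < 0.1  # matches the value reported in Table 2
```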
The final step is to sum the description length of the model L(θ) with the description length of the
network in terms of the model L(x|θ). A summary of these calculations for both the one- and two-position
blockmodels is presented in Table 2, along with the G²θ statistic for each model.
Wang and Wong (1987) argue that the use of blocking information can improve the fit of p1 models.
For reasons argued above, there are many definitions of goodness-of-fit for which this claim is by definition
true. A model with more parameters and thus higher complexity should more accurately predict a set of
observed data than a less complex model. The results presented in Table 2 suggest that the cost of the model
complexity introduced by considering the gender relations of the schoolchildren in the 2-position p1 model
is more than made up for by the gains made in terms of model precision and accuracy. The G²θ statistic
proposed by Wasserman and Faust (1994) also suggests that the 2-position model more accurately represents
the data, as expected.
The algorithmic complexity of the p1 stochastic blockmodels of the classroom data suggests that the
friendship patterns of the sixth-graders are structured in a way that is consistent with their gender differences.
However, this analysis cannot on its own rule out the possibility that transitive closure plays an important
if not primary role in structuring these friendship relations. In order to explore this possibility, I analyze a
series of models that test the separate and combined effect of gender and transitivity in structuring friendship
relations in this population.
Model 3 is a baseline ERGM analogous to the 1-position undifferentiated p1 model analyzed
above. While this model does not include separate attractiveness and productivity parameters αi and βj, it
does include the geometrically weighted indegree and outdegree graph statistics u^(id)_α(x; θs) and u^(od)_α(x; θs).
Table 3: ERGM Parameter Estimates for Hansell (1984) Data

               Model 3    Model 4†   Model 5    Model 6†
Mutuality      -0.080     -0.028     -0.633     -0.220
               (0.740)    (0.014)    (0.794)    (0.996)
Indegree       -6.034     -1.184     -6.472     -1.383
               (3.779)    (0.014)    (13.110)   (1.443)
Outdegree      -16.090    -1.206     -13.045    -1.484
               (2.836)    (0.014)    (19.525)   (1.403)
Transitivity              0.216                 0.274
                          (0.838)               (0.124)
Gender Match                         1.104      0.785
                                     (0.819)    (0.367)
Choice         21.200     0.303      18.325     0.282
               (2.438)    (0.014)    (15.832)   (2.347)
− lg(p(x|θ))   659.40     559.49     617.72     542.04
L(θ)           17.41      21.64      22.14      26.37
L(x, θ)        676.81     581.13     639.86     568.41

†parameter estimation generated degenerate samples
While these statistics do not explicitly model individual variation in the likelihood of engaging in social
behavior, they do allow networks to be modeled in which there is individual variation across these tendencies.
Inasmuch as this does not add a pair of new model parameters for each additional actor in a system, it can
be a more efficient if not more realistic way to model this potential variance. The estimated total complexity
of Model 3 is somewhat higher than that of Model 1, suggesting that individual variance may in fact be
important in this small network.
Model 4 expands upon Model 3 by introducing a graph statistic for transitivity (u^(t)_0(x), the number
of transitive triads). While the degenerate samples generated during parameter estimation raise some
concern about the accuracy of the estimates, this model results in a substantial decrease in total complexity.
Model 5 produces a similarly large decrease in complexity by introducing a model statistic to account for the
tendency of friendship ties to be directed between students of the same gender. Model 6 includes both of these
parameters, resulting in a substantial improvement in total complexity over both Models 4 and 5, suggesting
that both transitivity and gender as a category are important structural features of this network. These
results are largely consistent with prior expectations about friendship relations in a sixth-grade classroom.
Transitive closure is a commonly observed property of most positive social relations, but it alone does not
explain the observed pattern of group structure. Rather, these results suggest that the gendered pattern of
friendship relations is not a random artifact, but may be reasonably attributed to the salience of gender as
Table 4: Candidate Partitions of Sampson’s Monks

           Group                Actors
Model 7    n/a                  All actors in one group
Model 8    Loyal Opposition     Ramauld, Bonaventure, Ambrose, Berthold, Peter, Louis, Victor
           Young Turks          Winfrid, John Bosco, Gregory, Hugh, Boniface, Mark, Albert
           Outcasts             Amand, Basil, Elias, Simplicus
Model 9    Hangers On           Ramauld, Bonaventure, Ambrose
           Loyal Opposition     Berthold, Peter, Louis, Victor
           Young Turk Leaders   Winfrid, John Bosco, Gregory
           Young Turks          Hugh, Boniface, Mark, Albert
           Outcasts             Amand, Basil, Elias, Simplicus
Model 10   n/a                  Each actor in his own group
a category in the choices of these students.
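The transitivity statistic ut(x) used in Model 4 can be computed directly from an adjacency matrix. The sketch below is a hypothetical Python illustration (the analyses reported here were presumably run with dedicated network software such as statnet); it counts ordered triples of distinct actors with ties i→j, j→k, and i→k:

```python
def transitive_triads(adj):
    """Count ordered triples (i, j, k) of distinct actors with ties
    i->j, j->k, and i->k: the transitivity statistic u_t(x)."""
    n = len(adj)
    count = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if len({i, j, k}) == 3 and adj[i][j] and adj[j][k] and adj[i][k]:
                    count += 1
    return count

# A small directed network: 0->1, 1->2, and the closing tie 0->2
# form exactly one transitive triad.
x = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
print(transitive_triads(x))  # -> 1
```

A three-cycle (0→1, 1→2, 2→0), by contrast, contains no transitive triad, since no shortcut tie closes any of its two-paths.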
4.3 Assessing Structure in Sampson’s Monastery
The analysis presented in the preceding section details how the algorithmic complexity approach can be
applied to an empirical problem of blockmodel selection. However, the two models proposed as candidate
blockmodels for the classroom data (Hansell 1984) by Wang and Wong (1987) cannot fully demonstrate
how the algorithmic complexity approach can effectively trade off between model accuracy and complexity.
In that example, the two-position model is both more accurate and a better model in terms of description
length. In this section, we present an analysis of a set of blockmodels for which the description length of the
model does not decrease monotonically with model accuracy.
In the analysis by White et al. (1976) of the monastery data (Sampson 1968), a number of alternative
blockmodels are proposed of varying degrees of complexity. In particular, White, Boorman and Breiger
present three- and five-position models of the social structure of the monastery corresponding to Models
8 and 9, respectively, as presented here. In this section, we use the algorithmic complexity framework to
analyze those two models in addition to Model 7, in which actors are undifferentiated by position,
and Model 10, in which each monk is assigned to his own position. Table 4 displays the monks
assigned to each position in each of these models.
As in the preceding section, we apply the p1 stochastic blockmodel (Wang and Wong 1987) as an un-
derlying probability distribution. Given this approach, we are left with the choice of how to assign the λrs
parameters to equivalence groups. Wang and Wong propose a number of methods by which this can be
done, each method representing a different specific hypothesis about the basic structure of within-position
and inter-position ties. In their original analysis, White, Boorman and Breiger do not assume a particular
structure of zero-blocks and one-blocks. In an effort to closely model their analysis, we present results for
the fully-saturated p1 blockmodel, which allows the maximum flexibility in determining λrs for each block.

Table 5: Maximum Likelihood Estimates of αi and βj from Sampson (1968) Data

                Model 7            Model 8            Model 9
Monk           αi      βj        αi      βj        αi      βj
Ramauld      1.039     −∞     -0.346     −∞      3.782     −∞
Bonaventure -0.603   1.391    -1.015   0.024    -0.329   0.970
Ambrose     -0.348   0.546    -1.314  -1.741    -0.329   0.970
Berthold     0.019  -0.914    -1.022  -3.079    -1.495  -6.373
Peter       -0.348   0.546    -1.314  -1.741    -1.203  -1.446
Louis        0.019  -0.552    -1.022  -3.350    -1.495  -3.500
Victor       0.318  -0.552    -1.227  -3.350    -1.900  -3.500
Winfrid     -0.603   1.391    -0.734   1.678    -0.112   2.213
John Bosco  -0.185   0.546    -0.802   0.009    -0.112   5.084
Gregory     -0.603   1.391    -0.734   1.678    -0.112   2.213
Hugh         0.318  -0.552    -0.802   0.009    -1.495  -0.348
Boniface     0.318  -0.914    -0.802  -0.660    -1.495  -4.459
Mark        -0.603   0.649    -0.802   1.108    -1.355  -0.348
Albert       0.318  -0.552    -0.802   0.009    -1.495  -0.348
Amand        0.318  -0.552     2.876   2.292     2.286   1.388
Basil        0.496  -0.335     4.043   2.370     2.286   2.665
Elias        0.318  -0.985     2.876   2.449     2.286   2.335
Simplicius  -0.185  -0.552     2.944   2.292     2.286   2.486

Table 6: Blockmodel Measures Applied to Sampson (1968) Data

            # Positions    G²θ     lg(p(x|θ))     L(θ)     L(x, θ)
Model 7          1         85.1      -170.9      148.6      319.5
Model 8          3         47.6      -116.0      165.1      281.2
Model 9          5         42.5      -107.1      214.7      321.8
Model 10        18          0.0         0.0     1341.8     1341.8
Table 5 presents the estimates of αi and βj for each model for which these estimates could be determined.
Estimates of αi and βj could not be determined for Model 10, as the system of equations needed to determine
these values is underdetermined as suggested in Section 4.1. However, estimates of m, a and n can be
determined for all of these matrices using the iterative scaling procedure, so the tie probabilities pij and the
likelihood p(x|θ) were estimated for all four blockmodels. The estimates of L(θ), lg(p(x|θ)) and L(x, θ) for each
model are presented, along with the corresponding G² statistic, in Table 6.
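Given fitted tie probabilities pij, the likelihood term of the description length follows mechanically. The Python sketch below is a simplified, hypothetical illustration that treats ties as independent Bernoulli draws; the full p1 model treats the two ties of each dyad jointly through the reciprocity parameter, so this is a sketch of the computation, not a reproduction of it:

```python
from math import log2

def log2_likelihood(x, p):
    """Base-2 log-likelihood lg p(x | theta) of an observed digraph x,
    given fitted tie probabilities p[i][j], treating ties as independent
    Bernoulli draws (a simplification of the dyad-based p1 likelihood)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-ties
            total += log2(p[i][j]) if x[i][j] else log2(1.0 - p[i][j])
    return total
```

With uniform pij = 0.5, any three-actor digraph has lg-likelihood −6: one bit for each of the six possible ties, consistent with the information-theoretic reading of lg p(x|θ) as a code length.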
These results more clearly demonstrate how a model selection criterion based on algorithmic complexity
can effectively be used to identify boundaries in a social system. As the number of blocks grows, the
precision and accuracy of the model grow, as measured both by p(x|θ) and G². At the same time, the model
complexity, as measured by L(θ), grows as well. The description length L(x, θ) presented here suggests that
the three-position Model 8 is the best-fit model of the four considered here, which matches the observational
intuition of Sampson (1968).
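The selection rule itself is mechanical: choose the model minimizing L(x, θ) = L(θ) − lg p(x|θ). A Python sketch using the values reported in Table 6 (totals agree with the table up to rounding):

```python
# (L(theta), -lg p(x | theta)) for each candidate blockmodel, from Table 6.
models = {
    "Model 7":  (148.6, 170.9),
    "Model 8":  (165.1, 116.0),
    "Model 9":  (214.7, 107.1),
    "Model 10": (1341.8, 0.0),
}

# Total description length L(x, theta) = L(theta) + (-lg p(x | theta)).
totals = {m: l_theta + neg_lg_lik for m, (l_theta, neg_lg_lik) in models.items()}
best = min(totals, key=totals.get)
print(best)  # -> Model 8
```

Model 10 fits perfectly (zero residual code length) but pays for it with an enormous model description, which is precisely the trade-off the MDL criterion formalizes.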
While these results are helpful in locating the boundaries between the actors in this social system, like
the p1 analysis of the classroom presented above, they cannot usefully speak to the extent to which the
apparent grouping has to do with categorical distinctions per se. While there is a reasonable a priori reason
to expect that sixth-graders might shape their social interactions along pre-existing categorical lines (e.g.
gender), there is somewhat less of a reason to expect the monks studied by Sampson (1968) to do the same.
Models 11-16 address this question by examining the separate and combined relevance of transitivity and
group structure to the pattern of affect relations in this social system. The results of each of these models are presented in Table 7.
Model 11, like Model 3 in the prior analysis, is a baseline ERGM of affect relations in the monastery.
Like the earlier ERGM models, this model does not estimate individual-level productivity and attractiveness,
but rather examines the general relevance of differences in the production and receipt of affect to structure
the entire network. While the estimated likelihood of Model 11 is somewhat lower than that of Model 7, it
is noteworthy that the total complexity is lower, suggesting that specific individual attributes are relatively less
important in this context. It is also noteworthy that the estimated coefficient for mutuality is consistently significant
in all six models examined here.
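The mutuality statistic to which these coefficients attach is simply the count of reciprocated dyads. As a minimal, hypothetical Python illustration:

```python
def mutual_dyads(adj):
    """Count unordered pairs {i, j} with both ties i->j and j->i present:
    the mutuality statistic of the ERGM."""
    n = len(adj)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if adj[i][j] and adj[j][i])

# 0<->1 is reciprocated; 0->2 is not: exactly one mutual dyad.
x = [[0, 1, 1],
     [1, 0, 0],
     [0, 0, 0]]
print(mutual_dyads(x))  # -> 1
```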
Model 12 introduces a statistic for transitivity, which results in a predicted likelihood p(x|θ) that is significantly
higher than that of Model 11, as well as a substantial reduction in total complexity. This is particularly
interesting in comparison to Model 13, which introduces model statistics sensitive to the pair of positions
that characterizes each dyad of actors. The total complexity for Model 13 is somewhat higher than that of Model
12, suggesting that transitivity is more important than group identities in explaining the overall pattern of
interaction. This result does not suggest that there are no meaningful boundaries around the three groups,
but rather that these boundaries are likely the result of the micro-process of transitive closure rather
than of some salient social characteristic.
The salience of the boundaries around the three groups is reflected in a comparison of the total complexity
of Models 11, 13, and 15: as in the p1 stochastic blockmodel analysis presented above, the three-group model
is preferred to both the one-group and five-group alternatives. Among the models that include transitivity
(Models 12, 14, and 16), the three-group Model 14 is likewise preferred to the five-group Model 16, although
once transitivity is accounted for, the block structure itself no longer reduces the total description length.
Table 7: ERGM Parameter Estimates for Sampson (1968) Data

                 Model 11  Model 12  Model 13  Model 14  Model 15  Model 16
Mutuality           2.590     2.245     1.363     1.340     1.192     1.082
                  (0.497)   (0.033)   (0.567)   (0.569)   (0.584)   (0.590)
Indegree           -1.877    -2.293    -2.727    -3.639    -2.682    -3.805
                  (0.921)   (0.033)   (1.043)   (1.265)   (1.047)   (1.253)
Outdegree           1.788    -0.007     2.081     2.330     1.544     2.210
                  (2.753)   (0.033)   (4.315)   (7.447)   (3.318)   (8.004)
Transitivity                  0.258              -0.206              -0.287
                            (1.591)             (0.145)             (0.162)
3-Block Effects                           Y         Y
5-Block Effects                                               Y         Y
Choice             -2.092     2.727    -3.179    -2.427    -2.951    -2.383
                  (2.727)   (0.033)   (4.503)   (7.968)   (3.571)   (8.214)
−lg(p(x|θ))        222.42    174.34    180.67    177.26    173.07    168.10
L(θ)                15.01     19.14     31.31     35.42     46.32     50.45
L(x, θ)            237.43    193.48    211.98    212.70    219.39    218.55
5 Discussion
Given the interest on the part of a wide range of social scientists in social boundaries and the processes that
attend to them, it is rather surprising that so little attention has been paid to their empirical identification.
One of the goals of sociological methods should be to provide tools that can be used with empirical data to either
support or call into question sociological theory and constructs. If boundaries figure centrally into theories
of structure, categories and type, then it is particularly important to develop methods that can empirically
establish their presence and significance. To the extent that a method cannot in principle provide evidence
that boundaries fail to play a significant role in a particular system, that method cannot in general be used in
other cases to support the claim that they do. The blockmodeling technique proposed by White et al. (1976)
and subsequent methods provide tools that can effectively be used to elicit possible social structures from
network data, and as such can be quite useful for supporting theories concerning roles and positions in social
groups. We have few tools, however, that can establish, for example, that boundaries are unimportant in a
particular social network, or that the emergence of group structure is attributable to some other micro-level
process.
To this end, in this article I have argued that analyses of social networks based on the idea of algorithmic
complexity are uniquely suited to identifying social structure. Not only can the method
outlined here generate falsifiable hypotheses about the presence or absence of structure, but it clearly relates
the selection of a particular model to specific claims about both the fit of the model to the observed network
and to the fundamental likelihood of observing the model itself. The underlying probabilistic nature of these
claims can moreover be used to show how a model selection framework based on the idea of algorithmic
complexity compares favorably to model selection approaches such as the AIC (Akaike 1974) and the BIC
(Raftery 1995). An additional feature of the approach outlined here is that it can be used with any stochastic
model of social network exchange. An implication of this is that, to the extent that models like the ERGM
(Snijders et al. 2006) can simultaneously model both explicit group structure and group structure produced
by local exchange processes, the significance of both of these potential antecedents of group structure can
independently be evaluated.
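As a point of comparison, the competing criteria can each be written as a penalized lack of fit, differing only in the penalty. The following Python sketch states the standard formulas (k parameters, n observations, natural-log likelihood for AIC and BIC, base-2 code lengths for MDL); it is an illustration of the criteria themselves, not a reproduction of the computations reported in this article:

```python
from math import log

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L (Akaike 1974)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * log(n) - 2 * log_lik

def mdl(model_length_bits, neg_lg_lik_bits):
    """Two-part description length in bits: L(theta) + (-lg p(x | theta))."""
    return model_length_bits + neg_lg_lik_bits
```

All three are minimized over candidate models; the MDL variant makes both the model cost and the data cost explicit code lengths rather than asymptotic penalties.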
While the approach outlined here is reasonably flexible in its ability to incorporate existing statistical
models of social network exchange, it is nevertheless limited in its ability to assess those sociologically
interesting forms of group structure that are yet to be represented in a stochastic network model. Particularly
noteworthy among these is the idea of regular equivalence (White and Reitz 1983), a generalization of
structural equivalence in which actors are judged to be equivalent if they have a similar pattern of relations
to other equivalent actors, rather than similar relations to exactly the same actors. This type of equivalence
structure is more characteristic of many common identity-based relations (parent-child, employer-employee,
core state-peripheral state) than is structural equivalence. However, while structural equivalence can be
straightforwardly captured in a statistical model of network exchange as demonstrated in the models reviewed
here, measures of regular equivalence have as yet failed to appear in these models. If group structure in
a particular social system is in fact driven by regular equivalence rather than structural equivalence, a
straightforward application of the methods presented here could lead to misleading results.
Limitations of the ability to model particular features notwithstanding, the approach presented here
represents a helpful step forward in the statistical analysis of structure in social networks. The empirical
examples presented here demonstrate not only how the method can be used to identify boundaries in a social
system, but also, as in the Sampson’s monastery example, how group structure can be identified that may
be the result of the local structure of exchange rather than that of explicit boundaries. While the results
presented here are consistent with those presented in earlier research, they are distinguished by the relative
strength of the statistical basis on which they rest. There has been a relative paucity of
recent research in which the empirical identification of boundaries has played a central role. If this can be
explained by the lack of a solid theoretical and methodological foundation for such research, then
the algorithmic complexity approach presented here should serve as a useful contribution.
References
Abbott, Andrew and Alexandra Hrycak, 1990. “Measuring Resemblance in Sequence Data: An Optimal
Matching Analysis of Musicians’ Careers.” American Journal of Sociology 96:144–185. ISSN 0002-9602.
Akaike, Hirotugu, 1974. “A New Look at the Statistical Model Identification.” IEEE Transactions on
Automatic Control 19:716–723. ISSN 0018-9286.
Alderson, Arthur S. and Jason Beckfield, 2004. “Power and Position in the World City System.” American
Journal of Sociology 109:811–851.
Anderson, Carolyn J., Stanley Wasserman, and Bradley Crouch, 1999. “A p∗ Primer: Logit Models for
Social Networks.” Social Networks 21:37–66.
Anderson, Carolyn J., Stanley Wasserman, and Katherine Faust, 1992. “Building Stochastic Blockmodels.”
Social Networks 14:137–161.
Besag, Julian, 1974. “Spatial Interaction and the Statistical Analysis of Lattice Systems.” Journal of the
Royal Statistical Society. Series B (Methodological) 36:192–236. ISSN 0035-9246.
Burnham, Kenneth P. and David R. Anderson, 2004. “Multimodel Inference: Understanding AIC and BIC
in Model Selection.” Sociological Methods and Research 33:261–304. doi:10.1177/0049124104268644.
Carrington, Peter J. and Greg H. Heil, 1981. “COBLOC: A Hierarchical Method for Blocking Network
Data.” Journal of Mathematical Sociology 8:103–131.
Carrington, Peter J., Greg H. Heil, and Stephen D. Berkowitz, 1979. “A Goodness-of-fit Index for Block-
models.” Social Networks 2:219–234.
Chaitin, Gregory J., 1966. “On the Length of Programs for Computing Finite Binary Se-
quences.” Journal of the Association for Computing Machinery 13:547–569. ISSN 0004-5411. doi:
http://doi.acm.org/10.1145/321356.321363.
Fienberg, Stephen E. and Stanley Wasserman, 1981. “Categorical Data Analysis of Single Sociometric
Relations.” In “Sociological Methodology 1981,” edited by Samuel Leinhardt, pp. 156–192. San Francisco:
Jossey-Bass.
Frank, Ove and David Strauss, 1986. “Markov Graphs.” Journal of the American Statistical Association
81:832–842. ISSN 0162-1459.
Gerlach, Michael L., 1992. “The Japanese Corporate Network: A Blockmodel Analysis.” Administrative
Science Quarterly 37:105–139.
Han, Shin-Kap and Phyllis Moen, 1999. “Clocking Out: Temporal Patterning of Retirement.” American
Journal of Sociology 105:191–236. ISSN 0002-9602.
Handcock, Mark S., David R. Hunter, Carter T. Butts, and Martina Morris, 2005. “statnet: An R Package
for the Statistical Analysis and Simulation of Social Networks.”
Hansell, Stephen, 1984. “Cooperative Groups, Weak Ties, and the Integration of Peer Friendships.” Social
Psychology Quarterly 47:316–328.
Holland, Paul W., Kathryn B. Laskey, and Samuel Leinhardt, 1983. “Stochastic Blockmodels: Some First
Steps.” Social Networks 5:109–137.
Holland, Paul W. and Samuel Leinhardt, 1981. “An Exponential Family of Probability Distributions for
Directed Graphs.” Journal of the American Statistical Association 76:33–65.
Hunter, David R., 2007. “Curved Exponential Family Models for Social Networks.” Social Networks (forthcoming).
Hunter, David R. and Mark S. Handcock, 2006. “Inference in Curved Exponential Family Models for Networks.”
Journal of Computational and Graphical Statistics 15:565–583.
Kolmogorov, Andrey N., 1965. “Three approaches to the quantitative definition of complexity.” Problems
in Information Transmission 1:4–7.
Krackhardt, David, 1987. “Cognitive Social Structure.” Social Networks 9:109–134.
Kuha, Jouni, 2004. “AIC and BIC: Comparisons of Assumptions and Performance.” Sociological Methods
and Research 33:188–229. doi:10.1177/0049124103262065.
Lamont, Michele and Virag Molnar, 2002. “The Study of Boundaries in the Social Sciences.” Annual Review
of Sociology 28:167–195.
Laumann, Edward O., Joseph Galaskiewicz, and Peter V. Marsden, 1978. “Community Structure as Interor-
ganizational Linkages.” Annual Review of Sociology 4:455–484. doi:10.1146/annurev.so.04.080178.002323.
Lorrain, Francois P. and Harrison C. White, 1971. “Structural Equivalence of Individuals in Social Networks.”
Journal of Mathematical Sociology 1:48–80.
Nowicki, Krzysztof and Tom A. B. Snijders, 2001. “Estimation and Prediction for Stochastic Blockstruc-
tures.” Journal of the American Statistical Association 96:1077–1087.
Panning, William H., 1982. “Fitting Blockmodels to Data.” Social Networks 4:81–101.
Pattison, Philippa and Garry Robins, 2002. “Neighborhood-Based Models for Social Networks.” Sociological
Methodology 32:301–337. ISSN 0081-1750.
Phillips, Damon J. and Ezra W. Zuckerman, 2001. “Middle-Status Conformity: Theoretical Restatement
and Empirical Demonstration in Two Markets.” American Journal of Sociology 107:379–429.
Raftery, Adrian E., 1995. “Bayesian Model Selection in Social Research.” In “Sociological Methodology
1995,” edited by Peter V. Marsden, pp. 111–196. San Francisco: Jossey-Bass.
Rissanen, Jorma, 1983. “A Universal Prior for Integers and Estimation by Minimum Description Length.”
The Annals of Statistics 11:416–431. ISSN 0090-5364.
———, 1989. Stochastic Complexity in Statistical Inquiry. Teaneck, N.J.: World Scientific.
Sampson, Samuel F., 1968. A Novitiate in a Period of Change: An Experimental and Case Study of Relationships.
Ph.D. thesis, Cornell University.
Shannon, Claude E., 1948. “A Mathematical Theory of Communication.” Bell System Technical Journal
27:379–423.
Snijders, Tom A. B., 2002. “Markov Chain Monte Carlo Estimation of Exponential Random Graph Models.”
Journal of Social Structure 3.
Snijders, Tom A. B. and Krzysztof Nowicki, 1997. “Estimation and Prediction for Stochastic Blockmodels
for Graphs with Latent Block Structure.” Journal of Classification 14:75–100. ISSN 0176-4268. doi:
10.1007/s003579900004.
Snijders, Tom A. B., Philippa E. Pattison, Garry L. Robins, and Mark S. Handcock, 2006. “New
Specifications for Exponential Random Graph Models.” Sociological Methodology 36:forthcoming. doi:
10.1111/j.1467-9531.2006.00171.x.
Snyder, David and Edward L. Kick, 1979. “Structural Position in the World System and Economic Growth,
1955-1970: A Multiple-Network Analysis of Transnational Interactions.” American Journal of Sociology
84:1096–1126. ISSN 0002-9602.
Solomonoff, Raymond J., 1964. “A Formal Theory of Inductive Inference, Part 1 and Part 2.” Information
and Control 7:224–254.
Stine, Robert A., 2004. “Model Selection Using Information Theory and the MDL Principle.” Sociological
Methods and Research 33:230–260. doi:10.1177/0049124103262064.
Stovel, Katherine, Michael Savage, and Peter Bearman, 1996. “Ascription into Achievement: Models of
Career Systems at Lloyds Bank, 1890-1970.” American Journal of Sociology 102:358–399. ISSN 0002-
9602.
Strauss, David and Michael Ikeda, 1990. “Pseudolikelihood Estimation for Social Networks.” Journal of the
American Statistical Association 85:204–212. ISSN 0162-1459.
van Duijn, Marijtje. A. J., Tom A. B. Snijders, and Bonne J. H. Zijlstra, 2004. “p2: a Random Effects Model
with Covariates for Directed Graphs.” Statistica Neerlandica 58:234–254.
Wallace, Christopher S. and David M. Boulton, 1968. “An Information Measure for Classification.” Computer
Journal 11:185–194.
Wallace, Christopher S. and David L. Dowe, 1999. “Minimum Message Length and Kolmogorov Complexity.”
The Computer Journal 42:270–283.
Wang, Yuchung J. and George Y. Wong, 1987. “Stochastic Blockmodels for Directed Graphs.” Journal of
the American Statistical Association 82:8–19.
Wasserman, Stanley and Katherine Faust, 1994. Social Network Analysis: Methods and Applications. Cam-
bridge: Cambridge University Press.
Wasserman, Stanley and Philippa Pattison, 1996. “Logit Models and Logistic Regression for Social Networks:
I. An Introduction to Markov Graphs and p∗.” Psychometrika 61:401–425.
White, Douglas R. and Karl P. Reitz, 1983. “Graph and Semi-group Homomorphisms on Networks of
Relations.” Social Networks 6:193–235.
White, Harrison C., Scott C. Boorman, and Ronald L. Breiger, 1976. “Social Structure from Multiple
Networks. I. Blockmodels of Roles and Positions.” American Journal of Sociology 81:730–779.
Zijlstra, Bonne J. H. and Marijtje A. J. van Duijn, 2003. Manual p2. Version 2.0.0.7. iec ProGAMMA/University
of Groningen, Groningen.