Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Estimating Psychological Networks and their Stability: a Tutorial Paper
Authors: Sacha Epskamp*1, Denny Borsboom1 & Eiko I. Fried2
Affiliations: 1Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands.
2Faculty of Psychology and Educational Sciences, University of Leuven, Leuven,
Belgium.
*Corresponding author: Sacha Epskamp, Department of Psychological Methods, University of Amsterdam. Nieuwe Achtergracht 129, 1018 WT Amsterdam, The Netherlands; tel: +31 6 25690541, email: [email protected]
STABILITY OF PSYCHOLOGICAL NETWORKS
2
Abstract
Over the course of the last years, network research has gained increasing popularity in psychological
sciences. Especially in clinical psychology and personality research, such psychological networks
have been used to complement more traditional latent variable models. While prior publications have
tackled the two topics of network psychometrics (how to estimate networks) and network inference
(how to interpret networks), the topic of network stability still provides a major challenge: it is unclear
how susceptible network structures and graph theoretical measures derived from network models are
to sampling error and choices made in the estimation method, greatly limiting the certainty of
substantive interpretation. In this tutorial paper, we aim to address these challenges. First, we
introduce the current state-of-the-art of network psychometrics and network inference in psychology
to allow readers to use this tutorial as a stand-alone to estimate and interpret psychopathological
networks in R. Second, we describe how bootstrap routines can be used to assess the stability of
network parameters. To facilitate this process, we developed the freely available R-package bootnet.
We apply bootnet, accompanied by R syntax, to a dataset of 359 women with posttraumatic stress
disorder available online. We show how to estimate confidence intervals around edge weights, and
how to examine stability of centrality indices by subsampling nodes and persons. This tutorial is
aimed at researchers with all levels of statistical experience, and can also be used by beginners.
Keywords: Gaussian Graphical Model; Ising Model; LASSO; network models; bootstrap; tutorial
STABILITY OF PSYCHOLOGICAL NETWORKS
3
Introduction
In the last five years, network research has gained substantial attention in psychological
sciences (Borsboom & Cramer, 2013). Novel statistical models have been developed that allow for
estimating psychological network structures (Bringmann et al., 2013; van Borkulo et al., 2014).
Psychological networks consist of nodes representing observed variables, connected by edges
representing statistical relationships. Such networks are strikingly different from, e.g., power grids
(Watts & Strogatz, 1998), social networks (Wasserman & Faust, 1994) or ecological networks (Barzel
& Biham, 2009), in which nodes represent entities (e.g., airports, people, organisms) and connections
are generally observed and known (e.g., electricity lines, friendships, mutualistic relationships). In
psychological networks, the goal is to find pairwise interactions, predictive effects or even potentially
causal relationships between observed psychological (and possibly biological) nodes. These
connections are not known, and thus need to be estimated. Recent statistical advances allow for the
estimation of such unknown network structures on datasets typically obtained in psychological
research (e.g., Barber & Drton, 2015; Foygel & Drton, 2010; Friedman, Hastie, & Tibshirani, 2008;
Haslbeck & Waldorp, 2015; van Borkulo et al., 2014). These methods have been implemented in
statistical software specifically aimed at psychologists without advanced statistical knowledge
(Epskamp, Cramer, Waldorp, Schmittmann, & Borsboom, 2012; van Borkulo & Epskamp, 2014), and
tutorials have emerged that detail important steps in estimating and analyzing psychological networks
(Costantini et al., 2015). Adopting this new methodology has allowed researchers to explore
important substantive questions, for instance in personality research (e.g., Costantini et al., 2015) and
clinical psychology (e.g., Cramer, Waldorp, van der Maas, & Borsboom, 2010; Fried et al., 2015;
Goekoop & Goekoop, 2014; Mcnally et al., 2014).
While network psychometrics (the estimation of psychological networks) and network
inference (the interpretation of networks) have been addressed in prior work, an aspect that has
received little attention so far is the parameter stability of networks, which we will simply call
stability. With stability, we mean that it is as of yet unclear how susceptible estimated network
structures and graph theoretical measures derived from network models are to sampling error and
choices made in the estimation method. This problem of stability is omnipresent in statistics. Imagine
STABILITY OF PSYCHOLOGICAL NETWORKS
4
researchers employ a regression analysis to examine three predictors of depression severity, and
identify one strong, one weak, and one unrelated regressor. If removing one of these three regressors,
or adding a fourth one, substantially changes the regression coefficients of the other regressors, results
are unstable and depend on specific decisions the researchers make, implying a problem of stability.
The same holds for psychological networks. Due to the current uncertainty, there is the danger to
obtain network structures sensitive to specific variables included, or sensitive to specific estimation
methods. This poses a major challenge, especially when substantive interpretations such as treatment
recommendations in the psychopathological literature, or the generalizability of the findings, are
important. The current replication crisis in psychology (Open Science Collaboration, 2015) stresses
the crucial importance of obtaining robust results, and we want the emerging field of
psychopathological networks to start off on the right foot.
This article has two main goals. First, we describe the current state-of-the-art for
psychological network analysis: how are such network structures best estimated, and how can we use
these structures for statistical inference? We include this part to make this tutorial stand-alone
readable for psychological researchers. Second, we describe three different bootstrapping techniques
that cover the most important network stability tests. We do so using the novel package bootnet
(Epskamp, 2016a) developed for the free statistical programming environment R (R Development
Core Team, 2008), and demonstrate the package's functionality in a dataset of 359 women with post-
traumatic stress disorder (PTSD; Hien et al., 2009) that can be downloaded from the Data Share
Website of the National Institute on Drug Abuse. It is important to note that the focus of our tutorial is
on cross-sectional network models that can readily be applied to many current psychological datasets.
We hope that this tutorial will enable researchers to gauge the stability and certainty of the results
obtained from network models, and to provide editors, reviewers, and readers of psychological
network papers the possibility to better judge whether substantive conclusions drawn from such
analyses are defensible.
STABILITY OF PSYCHOLOGICAL NETWORKS
5
Network psychometrics and network inference
A psychological network is a model in which nodes represent observed psychological variables—
usually psychometric test items such as responses to questions about whether a person suffered from
insomnia or fatigue in the past weeks—which are connected by edges that indicate some statistical
relationship between them. These models are conceptually different from commonly used reflective
latent variable models that explain the co-occurrence among symptoms—the fact that individuals
often suffer from sadness, insomnia, fatigue and concentration problems at the same time—by a
causal of an underlying unobserved latent trait—e.g., depression. Psychological networks offer a
different conceptual interpretation of the data and explain such co-occurrences via direct relationships
between symptoms; someone who sleeps poorly will be tired, and someone who is tired will not
concentrate well (Fried, 2015; Schmittmann et al., 2013). Such relationships can then be more easily
interpreted when drawn as a network structure, in which edges indicate pathways on which nodes can
affect each other. The edges can differ in strength of connection, also termed the edge weight
(Epskamp et al., 2012) indicating if a relationship is strong (commonly visualized with thick edges) or
weak (thin, less saturated edges), and positive (green edges) or negative (red edges). After a network
structure is estimated, the visualization of the graph itself tells the researcher a detailed story of the
multivariate dependencies in the data. Additionally, many inference methods from graph theory can
be used to assess which nodes are the most important nodes in the network, termed the most central
nodes.
Directed and undirected networks
In general, there are two types of edges that can be present in a network; an edge can be
directed, in which case one head of the edge has an arrowhead indicating a one-way effect, or an edge
can be undirected, indicating some mutual relationship. A network that contains only directed edges is
termed a directed network, whereas a network that contains only undirected edges is termed an
undirected network (Newman, 2010). Directed networks have been of great interest in many fields as
they can be used to encode causal structures (Pearl, 2000). For example, the edge insomnia à fatigue
can be taken to indicate that insomnia causes fatigue. The work of Pearl describes that such causal
structures can be tested using only observational cross-sectional data, and even estimated to a certain
STABILITY OF PSYCHOLOGICAL NETWORKS
6
extend (Kalisch et al., 2012; Scutari, 2010). However, when temporal information is lacking, there is
only limited information present in cross-sectional observational data. Such estimation methods
typically only work under the very strict assumptions that all entities that play a causal role have been
measured and the causal chain of cause and effect is not cyclic (a variable cannot cause itself via any
path). Both assumptions are not very plausible in psychological systems. Furthermore, such directed
networks suffer from the problem that many equivalent models can exist that feature the same
relationships found in the data (MacCallum & Browne, 1993), which makes the interpretation of
structures difficult. For example, the structure insomnia à fatigue à concentration is statistically
equivalent to the structure insomnia ß fatigue à concentration and the structure insomnia ß fatigue
ß concentration: all three only indicate that insomnia and concentration problems are conditionally
independent after controlling for fatigue.
For the reasons outlined above, psychological networks estimated on cross-sectional data are
typically undirected networks. The current state-of-the-art of estimating undirected psychological
network structures involves the estimation of Pairwise Markov Random Fields (PMRF; Costantini,
Epskamp, et al., 2015; van Borkulo et al., 2014). A PMRF is a network model in which edges indicate
the full conditional association between two nodes after conditioning on all other nodes in the
network. This means that, when two nodes are connected, there is a relationship between these two
nodes that cannot be explained by any other node in the network. Simplified, it can be understood as a
partial correlation controlling for all other connections. In addition, the absence of an edge between
two nodes indicates that these nodes are conditionally independent of each other given the other nodes
in the network. Thus, a completely equivalent undirected structure to the structures described above
would be insomnia — fatigue — concentration, indicating that insomnia and concentration problems
are conditionally independent after controlling for fatigue.
STABILITY OF PSYCHOLOGICAL NETWORKS
7
Figure 1. Example network
Figure 1 shows a PMRF similar to the example described above. In this network, there is a
positive relationship between insomnia and fatigue, and a negative relationship between nodes fatigue
and concentration. The positive edge is thicker and more saturated than the negative edge, indicating
that this interaction effect is stronger than that of the negative edge. This network shows that insomnia
and concentration do not directly interact with each other in any way, only through their common
connection with fatigue. Therefore, fatigue is the most important node in this network – a concept we
will later quantify as centrality. These edges can be interpreted in several different ways. First, as
shown above, the model is in line with causal interpretations of associations among the symptoms.
Second, this model implies that insomnia and fatigue predict each other after controlling for
concentration; even when we know someone is concentrating poorly, that person is more likely to
suffer from insomnia when we observe that person to suffer from fatigue. Similarly, fatigue and
concentration predict each other after controlling for insomnia. Furthermore, after controlling for
fatigue there is no longer any predictive quality between insomnia and concentration, even though
these variables are correlated; fatigue now mediates the prediction between these two symptoms
A
B
C
STABILITY OF PSYCHOLOGICAL NETWORKS
8
(Epskamp, Maris, Waldorp, & Borsboom, 2016). Finally, these edges can represent genuine
symmetric causal interactions between symptoms. For example, in statistical physics a PRMF, the
Ising model, is used model that particles cause neighboring particles to be aligned.
Before networks can be interpreted as such, however, they must first be constructed. While
this can be rather straightforward in, e.g., social networks (social networking sites now readily supply
a list of connections), train networks (just ask the railway managers), or image processing (pixels next
to each other are connected), it is a challenging task in psychology: what is the structure of a
psychological system such as PTSD? We will refer to the field dedicated to estimating psychological
network structures network psychometrics (Epskamp et al., 2016), and provide a brief overview in the
following section.
Network Psychometrics
Ising model and Gaussian graphical model
When the data is binary, the appropriate PRMF model to use is called the Ising model (van
Borkulo et al., 2014), and requires binary data to be estimated. When the data is multivariate normally
distributed, the appropriate PRMF model is called the Gaussian graphical model (GGM; Costantini,
Epskamp, et al., 2015; Lauritzen, 1996), in which edges can directly be interpreted as partial
correlation coefficients. A GGM in which all possible edges are included in the network is also
termed a partial correlation network, as edge weights then directly equal partial correlation
coefficients. The Ising model can be estimated by using the EstimateIsing function from the R-
package IsingSampler (Epskamp, 2016b) with the option method = “ll”. The GGM, on the other
hand, can be estimated by using the qgraph function from the R-package qgraph (Epskamp et al.,
2012) with the option graph = “pcor” .
When data is neither binary nor multivariate normal, the Ising model and GGM can still be
used. For the Ising model, data need to be binarized (e.g., by median split or based on some
theoretical reasoning). For the GGM, data either need to be transformed to approximate multivariate
normality (e.g., by using the nonparanormal transformation; Liu, Lafferty, & Wasserman, 2009), or,
in case some or all variables are ordinal, an estimate of an underlying normally distributed correlation
STABILITY OF PSYCHOLOGICAL NETWORKS
9
matrix has to be obtained (e.g., by using polychoric correlations; Olsson, 1979). Alternatively, novel
methodology has been developed to estimate a PMRF directly containing both continuous and
categorical variables (Haslbeck & Waldorp, 2015). This model can be estimated using the mgmfit
function from the mgm package in R.
Dealing with the problem of small N in psychological data
Both the Ising model and the GGM feature a severe limitation: the number of parameters to
estimate grows quickly with the size of the network. In a 10-node network, 55 parameters (10
threshold parameters and 10*9/2=45 pairwise association parameters) need be estimated already. This
number grows to 210 in a network with 20 nodes, and to 1275 in a 50-node network. To reliably
estimate that many parameters, the number of observations needed typically exceeds the number
available in characteristic psychological data. While a lot is a loosely defined statement, and future
research is needed to define strict guidelines, a researcher will, at the very minimum, want to measure
at least as many people as there are parameters in the model. With too few observations, the standard
errors of the estimated parameters can become intolerably large and inference on the estimated
network structure becomes pointless.
Recently, network psychometricians have applied the "least absolute shrinkage and selection
operator" (LASSO; Friedman, Hastie, & Tibshirani, 2008) to remediate the large standard errors in
the Ising model and GGM when estimated with relatively little data. This uncertainty of parameters
also implies that some estimated edges in the network may be spurious and only due to chance. A way
to remedy this problem is to use a penalty function, also termed regularization. The LASSO employs
such a regularizing penalty by limiting the total absolute sum of parameter values, leading many edge
estimates to shrink to exactly zero and dropping out of the model. As such, the LASSO returns a
sparse (or, in substantive terms, conservative) network model: only a relatively small amount of edges
are used to explain the covariation structure in the data. Because of this sparsity, the estimated models
become more interpretable. The LASSO utilizes a tuning parameter to control the degree to which
regularization is applied. This tuning parameter can be selected by minimizing the Extended Bayesian
Information Criterion (EBIC; Foygel & Drton, 2010).
STABILITY OF PSYCHOLOGICAL NETWORKS
10
Estimating regularized networks in R is straightforward. For the Ising model, LASSO
estimation using EBIC has been implemented in the IsingFit package (van Borkulo & Epskamp,
2014). For GGM networks, a well-established and fast algorithm for estimating LASSO regularization
is the graphical LASSO (glasso; Friedman et al., 2008), which is implemented in the package glasso
(Friedman, Hastie, & Tibshirani, 2014). The qgraph package—using the qgraph function with the
option graph = "glasso"—runs glasso with tuning parameter selection using the EBIC.
Network inference
In the first step of network analysis, the obtained network is typically visualized to show the
structure of the data (Epskamp et al., 2012). Afterwards, inference methods derived from graph theory
can be applied to the network structure. The estimated PRMF is always a weighted network, which
means that we not only look at the structure of the network (are two nodes connected or not), but also
at the strength of connection between pairs of nodes. Because of this, many typical inference methods
that concern the global structure of the network, such as smallworldness, density and global clustering
(Kolaczyk & Csárdi, 2014; Newman, 2010; Watts & Strogatz, 1998), are less useful in the context of
psychological networks, because they only take into account whether nodes are connected or not, and
not the strength of association among nodes. Because the global inference methods for weighted
networks and PRMFs are still in development and no consensus has been reached yet, the network
inference section focuses on local network properties: how are two nodes related, and what is the
influence of a single node?
Relation between two nodes
The relationship between two nodes can be assessed in two ways. First, one can directly
assess the edge weight. This is always a number that is nonzero (as an edge weight of zero would
indicate there is no edge). The sign of the edge weight (positive or negative) indicates the type of
interaction, and the absolute value of the edge weight indicates the strength of the effect; a positive
edge weight of, for example, 0.5 is equal in strength to a negative edge weight of -0.5, and both are
stronger than an edge weight of 0.2. Two strongly connected nodes influence each other more easily
STABILITY OF PSYCHOLOGICAL NETWORKS
11
than two weakly connected nodes. Similar to how two persons standing closer to each other can
communicate more easily (talking) than two people standing far away from each other (shouting), two
strongly connected nodes are closer to each other. As such, the length of an edge is defined as the
inverse of the edge strength. Finally, the distance between two nodes is equal to the sum of lengths of
all edges on the shortest path between the two nodes (Newman, 2010).
Node centrality
The importance of individual nodes in the network can be assessed by investigating the node
centrality. A visualization of a network—such as the ones shown later in this manuscript in Figure
2—is an abstract rendition of a high-dimensional space in two dimensions. While visualizations of
network models often aim to place highly connected nodes into the center of the graph, for instance
using the Fruchterman-Reingold algorithm (Fruchterman & Reingold, 1991), the two-dimensional
visualization cannot properly reflect the true space of the model. Thus, the metric distance between
the placement of nodes in the two-dimensional space has no direct interpretation as it has in, e.g.,
multidimensional scaling. Therefore, graph theory has developed several methods to more objectively
quantify which node is more central in a network than others. Three such centrality measures have
appropriate weighted generalizations that can be used with psychological networks (Opsahl,
Agneessens, & Skvoretz, 2010).
First, node strength, also called degree in unweighted networks (Newman, 2010), simply adds
up the strength of all connected edges to a node; if the network is made up of partial correlation
coefficients the node strength equals the sum of absolute partial correlation coefficients between a
node and all other nodes. Second, closeness takes the inverse of the sum of all shortest paths between
a node and all other nodes in the network. Thus, where node strength investigates how strongly a node
is directly connected to other nodes in the network, closeness investigates how strongly a node is
indirectly connected to other nodes in the network. Finally, betweenness looks at how many shortest
paths between two nodes go through the node in question; the higher the betweenness, the more
important a node is in connecting other nodes.
STABILITY OF PSYCHOLOGICAL NETWORKS
12
Network stability
The above description is an overview of the current state of network analysis in psychology.
While network inference is typically performed by assessing edge strengths and node centrality, little
work has been done in investigating how stable these inferences are. Imagine we find that symptom A
has a much higher node strength than symptom B in a psychopathological network, leading to the
clinical interpretation that A may be a more relevant target for treatment than the peripheral symptom
B (Fried, Epskamp, Nesse, Tuerlinckx, & Borsboom, 2016). Clearly, this interpretation relies on the
assumption that the centrality estimates are indeed different from each other. Similarly, while two
edges in a network may be differentially strong, it remains unclear how strong the evidence for that
difference is if we do not take into account the variability of the edge weight estimation. We have
developed the free to use R extension bootnet that implements various methods for assessing the
stability of network models. Bootnet comes with convenient default arguments to estimate the Ising
model and the GGM, and outputs easily indexable data frames as well as interpretable plots.
Edge weight bootstrap
To assess the variability of edge weights, we can estimate confidence intervals (CI); in 95%
of the cases such a CI will contain the true value of the parameter. To construct a CI, we need to know
the sampling distribution of the statistic of interest. While such sampling distributions can be difficult
to obtain for complicated statistics such as centrality measures, there is an alternative and simple way
of constructing CIs: bootstrapping (Efron, 1979). Bootstrapping involves repeatedly taking random
samples with replacement of the original data, followed by estimating the statistic of interest.
Following the bootstrap, a 1-α CI can be constructed as the interval between quantiles 1/2α and 1-
1/2α of the bootstrapped distribution.
Bootstrapping edge weights can be done in two ways: using non-parametric bootstrap and
parametric bootstrap (Bollen & Stine, 1993). Both methods involve sampling a series of new plausible
datasets with the same number of observations and then recomputing the statistic of interest over
these new datasets. In non-parametric bootstrapping, observations in the data are resampled with
STABILITY OF PSYCHOLOGICAL NETWORKS
13
replacement to create new plausible datasets, whereas parametric bootstrapping samples new
observations from the parametric model that has been estimated from the original data; this creates a
series of values that can be used to estimate the sampling distribution.
Non-parametric bootstrapping can always be applied, whereas parametric bootstrapping
requires a parametric model of the data. This is no problem in network analysis, as we already have a
parametric model for the data; the estimated network structure fully encodes the joint probability
distribution for the data. When we estimate a GGM, data can be sampled by sampling from the
multivariate normal distribution through the use of the R package mvtnorm (Genz et al., 2015); to
sample from the Ising model, we have developed the R package IsingSampler (Epskamp, 2015b).
Using the GGM model, the parametric bootstrap samples continuous multivariate normal data – an
important distinction from ordinal data if the GGM was estimated using polychoric correlations.
Therefore, we advise the researcher to use the non-parametric bootstrap when handling ordinal data.
Furthermore, the non-parametric bootstrap is fully data-driven and requires no theory, whereas the
parametric bootstrap is more theory driven. As such, it is advisable to first investigate the non-
parametric bootstrap results, and examine the parametric bootstrap results if the non-parametric
results prove unstable or to check for correspondence of the CIs between both methods.
Subset bootstrap methods
While the CIs of edge weights can be constructed using the bootstrap, we discovered in the
process of this research that constructing CIs for centrality indices is far from trivial. As discussed in
more detail in the supplementary materials, bootstrapping centrality indices results in highly biased
sampling distributions, and thus the bootstrap cannot readily be used to construct true 95% CIs. While
intervals can be constructed that give a hint of the sampling variation around centrality indices, they
should thus not be interpreted as CIs. Our suggestion is rather to focus on the edge weight CIs to
obtain insight into the stability of the network estimation. For centrality indices, we suggest to instead
look at the stability of network measures based on subsets of the data. There are two ways to do so.
Taking subsets of people in the dataset employs the so called m out of n bootstrap, which is
commonly used to remediate problems with the regular bootstrap (Chernick, 2011). Taking subsets of
STABILITY OF PSYCHOLOGICAL NETWORKS
14
variables is also a viable option in assessing the stability of centrality measures (Costenbader &
Valente, 2003). While both subsetting methods do not lead to valid CIs, they can be used to assess the
correlation between the original centrality indices and those obtained from subsets, which can guide
the interpretation of stability. If the order of centrality indices of a number of symptoms in the
network completely changes after dropping 10% of the people or nodes, then the order cannot
meaningfully be interpreted.
In network analysis, researchers usually are less interested in discovering the true centrality of
a node than in the order of centrality found in the network: which nodes are the most important? If
that order is not stable, then no meaningful interpretation can be made on the order of centrality in the
network analysis. To assess this stability, we propose to investigate the correlation between the
centrality based on the entire sample and the centrality based on a subset of the sample: the higher this
correlation is, the more stable the results. This can be done in two ways, by dropping nodes from the
data (node-dropping), or by dropping persons from the data (person-dropping). Costenbader and
Valente (Costenbader & Valente, 2003) describe node-dropping in investigating centrality stability,
while person-dropping is comparable to the m out of n bootstrap. We term this general framework,
covering both node-dropping and person-dropping, subset bootstrap.
The bootnet package includes subset bootstrap and produces plots of the correlation of
centrality indices under different levels of subsetting, as described by Costenbader and Valente. The
bootnet package also visualizes the variability around this correlation, showing researchers if
centrality indices are stable in the dataset; if the correlation drasticly changes after only dropping, say,
10% of the nodes or persons, it is very clear that no meaningful interpretation can be made of the
centrality order. In addition to the correlation plot, bootnet can be used to plot the average estimated
centrality index for each node under different sampling levels, giving more detail on the order of
centrality under different subsetting levels.
STABILITY OF PSYCHOLOGICAL NETWORKS
15
Tutorial
We have implemented the non-parametric, parametric, and subset-dropping bootstrap in a
general framework for network analysis in the freely available R package bootnet. Specifically, this
can be used to bootstrap the edge weights, and also obtain information regarding the stability of
various centrality estimates. With bootnet, users can perform non-parametric and node-dropping
bootstrap for any R routine that estimates a PMRF, and parametric bootstrap can be performed for the
Ising model and the GGM. To exemplify the usage of bootnet in both estimating and investigating
network structures, we use a dataset of 359 women enrolled in community-based substance abuse
treatment programs across the United States (study title: Women's Treatment for Trauma and
Substance Use Disorders; study number: NIDA-CTN-0015; URL:
https://datashare.nida.nih.gov/protocol/nida-ctn-0015). All participants met the criteria for either
posttraumatic stress disorder (PTSD) or sub-threshold PTSD, according to the DSM-IV-TR (APA,
2000). Details of the sample, such as inclusion and exclusion criteria as well as demographic
variables, can be found elsewhere (Hien et al., 2009). We estimate the networks using the 17 PTSD
symptoms from the PTSD Symptom Scale-Self
Report (PSS-SR; Foa, Riggs, Dancu, & Rothbaum, 1993). Participants rated the frequency of
endorsing these symptoms on a scale ranging from 0 (not at all) to 3 (at least 4 or 5 times a week).
In a first step, we will show how to estimate the PTSD network. We then continue to
introduce the estimation of centrality indices. Finally, we provide a step-by-step tutorial to perform
the stability analyses introduced above; this will provide users with the possibility to examine the
stability of the edge weights and the stability of the centrality estimates.
Estimating the networks
Following the steps in Appendix A, the data can be loaded into R in a data frame called
FreqBaseline, which contains the frequency ratings at the baseline measurement point. Since the
PTSD symptoms are ordinal, we need to compute a polychoric correlation matrix to base our
networks on. We do so using the cor_auto function from the qgraph package, which automatically
STABILITY OF PSYCHOLOGICAL NETWORKS
16
detects ordinal variables and utilizes the R-package Lavaan (Rosseel, 2012) to compute polychoric
(or, if needed, polyserial and Pearson) correlations:
library("qgraph")
corFreq <- cor_auto(FreqBaseline)
Next, we compute a GGM based on the polychoric correlation matrix among symptoms obtained by
cor_auto. As described above, we can do so either by using the LASSO regularization that sets
small edges to exactly zero, or without this regularization technique. To estimate the model without
regularization, we can use the qgraph function together with the option graph = "pcor". Since
edge weights in this network are equal to partial correlation coefficients, we will refer to this network
as the partial correlation network.
graph_pcor <- qgraph(corFreq, graph = "pcor")
If we decide to compute a LASSO regularized GGM model instead, we can use the following codes;
we will term this network the glasso network.
graph_glasso <- qgraph(corFreq, graph = "glasso", sampleSize =
nrow(FreqBaseline))
We can plot both networks using the same layout as follows:
Layout <- averageLayout(graph_pcor, graph_glasso)
layout(t(1:2))
qgraph(graph_pcor, layout = Layout, labels = TRUE, title = "Partial
correlation network")
qgraph(graph_glasso, layout = Layout, labels = TRUE, title = "glasso
network")
STABILITY OF PSYCHOLOGICAL NETWORKS
17
The averageLayout function averages the node placement (Fruchterman & Reingold, 1991)
obtained from the different estimation methods and places the nodes in the same way in both graphs,
Figure 2. The unregularized partial correlation network (left) and the regularized glasso network
(right) of 17 PTSD symptoms (n=359).
enabling us to compare said networks better with each other. Figure 2 shows these two estimated
networks; the legend of the corresponding nodes can be found in Table 1. When we compare the
networks, we can see that the main connections in both networks are the same, that the partial
correlation network contains negative edges, and that the glasso network is overall much sparser.
Whereas the partial correlation network features all 136 edges, the glasso network only has 78 edges
due to the regularization; other edges are set to exactly 0.
In both networks, especially strong connections emerge among symptoms 3 (being jumpy)
and 4 (being alert), 5 (cut off from people) and 11 (interest loss), and 16 (upset when reminded of the
trauma) and 17 (upsetting thoughts/images). Other connections are very small or absent, for instance
between 7 (irritability) and 15 (reliving the trauma); this implies that these symptoms are statistically
independent when conditioning on all other symptoms.
1
23
4
5
6
7
8
9
1011
12
13
14
15
16
17
glasso network
1
23
4
5
6
7
8
9
1011
12
13
14
15
16
17
Partial correlation network
STABILITY OF PSYCHOLOGICAL NETWORKS
18
Estimating node centrality
Table 1. Node IDs and corresponding symptom names of the 17 PTSD symptoms.ID Variable
1 Avoid reminds of the trauma
2 Bad dreams about the trauma
3 Being jumpy or easily startled
4 Being over alert
5 Distant or cut off from people
6 Feeling emotionally numb
7 Feeling irritable
8 Feeling plans won’t come true
9 Having trouble concentrating
10 Having trouble sleeping
11 Less interest in activities
12 Not able to remember
13 Not thinking about trauma
14 Physical reactions
15 Reliving the trauma
16 Upset when reminded of trauma
17 Upsetting thoughts or images
To investigate centrality indices in both graphs, we can use the centralityPlot function
from the qgraph package:
centralityPlot(list(pcor = graph_pcor, glasso = graph_glasso))
As we can see in Figure 3, nodes differ quite substantially in their centrality estimates, and centrality
indices are rather similar across the two estimated networks. The most central node in the partial
correlation network is node 3 (“Being jumpy or easily startled”) in all three measures, whereas it is
only the node with the highest closeness in the glasso network; node 17 (“Upsetting thoughts or
STABILITY OF PSYCHOLOGICAL NETWORKS
19
image”) has the highest strength and betweenness in the glasso network. However, without knowing
the stability of the centrality estimates, we cannot conclude whether the differences of centrality
estimates within a network––or the differences across networks––are meaningful or not.
Figure 3: Centrality indices of partial correlation and glasso networks of 17 PTSD symptoms.
Stability analysis
To investigate the stability of edge weights and centrality estimates, we have to load the
bootnet package:
library("bootnet")
The main function of bootnet is the function bootnet, which performs the bootstrapping of
networks using various estimation methods. Bootnet comes with convenient default settings for the
most commonly used network psychometric methods. The default argument of bootnet can be set to
"pcor" or "EBICglasso" to estimate a GGM using cor_auto followed by nonregularized
(partial correlation network) or regularized (glasso network) estimation of the GGM exactly as shown
Betweenness Closeness Strength
● ●
●●
●●
● ●
●●
●●
● ●
●●
●●
● ●
● ●
●●
● ●
● ●
●●
●●
● ●
● ●
●●
● ●
● ●
●●
● ●
● ●
●●
●●
●●
● ●
●●
● ●
● ●
●●
●●
●●
● ●
●●
●●
●●
● ●
●●
● ●
●●
● ●
●●
● ●
●●
● ●
● ●
●●
●●
●●
1234567891011121314151617
−1 0 1 2 −2 −1 0 1 2 −2 −1 0 1 2
type●
●
glassopcor
STABILITY OF PSYCHOLOGICAL NETWORKS
20
above. Next, the type argument can be used to specify the type of bootstrap: "nonparametric"
for the non-parametric bootstrap, "parametric" for the parametric bootstrap, "node" for the
node-dropping bootstrap, and "person" for the person dropping bootstrap.
Edge weight bootstrap
In our case, the PTSD symptoms are ordinal, and as explained above we can only use the non-
parametric bootstrap. The following codes bootstrap the partial correlation network:
nonParametricBoot_pcor <- bootnet(FreqBaseline, default = "pcor",
type = "nonparametric")
Similarly, the glasso network can be bootstrapped as follows:
nonParametricBoot_glasso <- bootnet(FreqBaseline, default =
"EBICglasso", type = "nonparametric")
The print method of these objects gives an overview of characteristics of the sample network (e.g.,
the number of estimated edges) and tips for further investigation, such as how to plot the estimated
sample network or any of the bootstrapped networks. The summary method can be used to create a
summary table of certain statistics containing quantiles of the bootstraps. In this paper, we will only
demonstrate the plot method, which can be used to create a number of plots showing the confidence
intervals of network statistics. To do so, we can obtain the estimated 95% CIs for estimated edge
parameters for both the partial correlation network and the glasso network as follows:
plot(nonParametricBoot_pcor, labels = FALSE, order = "sample")
plot(nonParametricBoot_glasso, labels = FALSE, order = "sample")
Figure 4 shows the resulting plots and reveals sizable CIs around the estimated edge parameters. Of
note, the edges 16 (“Upset when reminded of trauma”) — 17 (”Upsetting thoughts or images”), 3
(“Being jumpy or easily startled”) — 4 (“Being over alert”) and 5 (“Distant of cut off from people”)
— 11 (“Less interest in activities”) are reliably the three strongest edges in both networks. However,
STABILITY OF PSYCHOLOGICAL NETWORKS
21
the large confidence intervals imply that interpreting the order of most edges in the network should be
done with care. Interestingly, the partial correlation network shows confidence intervals of equal size
for most edges, whereas confidence intervals in the glasso network shrink to be much smaller for the
Figure 4: Bootstrapped confidence intervals of estimated edge weights for the partial correlation
network (left) and the glasso network (right) of 17 PTSD symptoms. The red line indicates the sample
values and the gray area the 95% CIs.
edges that are set to zero; this means that the glasso network more reliably estimates the edge
parameters that are on or near zero.
Subset bootstrap
In the next step, we want to drop nodes from the network to examine whether this impacts on
the centrality estimates. We can perform the node-dropping bootstrap on the partial correlation and
glasso networks as follows:
nodeDropBoot_pcor <- bootnet(FreqBaseline, default = "pcor", type =
"node")
nodeDropBoot_glasso <- bootnet(FreqBaseline, default = "EBICglasso",
type = "node")
STABILITY OF PSYCHOLOGICAL NETWORKS
22
The plot method can again be used to investigate the results; by default, the average correlation
between centrality indices from the different sampling levels and the original dataset is plotted:
plot(nodeDropBoot_pcor)
plot(nodeDropBoot_glasso)
Figure 5 shows that the correlation drops steadily in the partial correlation network, whereas the
correlation of mostly the node strength statistic remains stable in the glasso network up to 50%
sampling level. This means that even when half of the nodes are dropped, the order of the node
strength centrality remains fairly stable, implying a robust estimation of node strength.
Figure 5: Average correlations between centrality indices of networks sampled with nodes dropped
and the original sample using partial correlation networks (left) and glasso networks (right).
Next, we can investigate the centrality of individual nodes when dropping nodes. We only examine
node strength centrality in the following plots, but betweenness and closeness can also be
investigated:
plot(nodeDropBoot_pcor, "strength", perNode = TRUE, legend = FALSE)
plot(nodeDropBoot_glasso, "strength", perNode = TRUE, legend =
FALSE)
STABILITY OF PSYCHOLOGICAL NETWORKS
23
Figure 6 shows that the order of node strength estimates is much harder to interpret in the partial
correlation network than in the glasso network (the scale on the y axis is different because glasso
always estimated edge weights closer or equal to zero than the partial correlation network).
Figure 6: Node strength estimates of individual nodes (colored lines) of the partial correlation
network (left) and the glasso network (right) after node-dropping.
We can repeat the same steps for person-dropping by using type = “person” in bootnet:
personDropBoot_pcor <- bootnet(FreqBaseline, default = "pcor", type
= "person")
personDropBoot_glasso <- bootnet(FreqBaseline, default =
"EBICglasso", type = "person")
The same codes as above create the plots for person dropping, which are shown in Figures 7 and 8.
These plots, again, show that only strength in the glasso network can meaningfully be interpreted.
A comparison of Figures 6 and 8 reveals that centrality goes down when subsampling nodes,
but up when subsampling people. Since centrality quantifies how much influence a node has, it is
natural to see that centrality decreases if a node has less potential other nodes to influence. In contrast,
●
●
●
●
●●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●●
●●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●
●
●●
●
●●
●
●
●
●
●
●●
●●
●
●●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●●
●
●●
●
●
●●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
●
●●●
●
●●
●●
0.5
1.0
1.5
2.0
strength
90% 80% 70% 60% 50% 40% 30% 20% 10%Sampled nodes
●
●
●●
●
●
●●●●
●●●●
●●
●
●●●●
●●●●
●●●
●●●●
●
●
●
●
●●
●
●
●●●●●●●●
●
●
●
●
●●●
●●●●●●●●
●
●
●
●
●
●
●●●
●●
●●●●
●●
●
●
●
●●
●●
●
●●●●●
●●●
●
●
●
●●
●●
●●
●●●●●●●
●
●
●
●
●
●
●●
●
●
●●●
●●●
●
●
●
●
●
●
●
●
●●●
●
●●
●●
●
●
●
●●
●●●
●●
●●●●●●
●
●
●
●
●
●
●
●●●
●
●
●●
●●
●
●
●
●●
●
●●
●●
●●
●●●●
●
●
●
●
●
●
●
●
●●
●●
●●●
●
●
●●
●●
●●
●
●●●
●●●●●
●
●
●
●
●●
●●
●●●●●
●●●
●
●
●●
●
●●●
●●●●
●●●●
●
●
●●
●
●
●●
●●
●●
●●●●
0.00
0.25
0.50
0.75
1.00
strength
90% 80% 70% 60% 50% 40% 30% 20% 10%Sampled nodes
STABILITY OF PSYCHOLOGICAL NETWORKS
24
centrality increases when dropping participants because if the true network is sparse, taking smaller
samples leads to larger sampling areas around zero, leading to on average higher absolute edge
weights (and thus higher centrality measures).
Figure 7: Average correlations between centrality indices of networks sampled with persons dropped
and the original sample using partial correlation networks (left) and glasso networks (right).
Figure 8: Node strength estimates of individual nodes (colored lines) of the partial correlation
network (left) and the glasso network (right) after person-dropping.
●
●●
●
●
●●
●
●
●
●●
●●
●
●●
●
●
●●
●
●●
●
●
●●
●
●●
●●
●●
●
●●
●
●
●●
●
●
●
●●
●●
●
●
●
●
●●
●
●
●
●●
●
●●
●
●
●●●
●●
●
●●
●●●●
●
●
●
●●
●●
●
●
●
●
●●
●
●
●
●●
●●● ●
●●
●●
●●
●
●
●
●●
●●
●
●
●
●
●
●
●
1.0
1.5
2.0
2.5
strength
90% 80% 70% 60%Sampled people
●●●●●●●
●
●●●●
●
●
●●●●●●
●
●
●●
●●●
●
●●●
●●●●
●●●●●●●
●
●●●
●●●
●●●●●●●
●●●●●●●
●●●
●●●●
●●
●●●●●
●
●●●●●
●
●●●●●●●
●●●
●●●●
●●
●●●●
●
●
●●●●●●
●●
●●●●
●
0.4
0.6
0.8
1.0
1.2
strength
90% 80% 70% 60%Sampled people
STABILITY OF PSYCHOLOGICAL NETWORKS
25
Flexible specification of bootnet
In addition to the default sets for glasso and partial correlation networks, bootnet sports two
more default sets to similarly estimate the Ising model. The default sets "IsingLL" and
"IsingFit" can be used for, respectively, nonregularized estimation (via IsingSampler by using a
formulization of the Ising model as a loglinear model; Agresti, 1990; Epskamp et al., 2016) and
regularized estimation (van Borkulo et al., 2014) of the Ising model. Furthermore, the user can
manually set the estimation method of the PMRF in bootnet using a set of arguments to the bootnet
function. First, the method of preprocessing the data must be defined via the prepFun argument,
which must be assigned a function that takes a dataset as input and returns the viable input for the
network estimator. The argument prepArgs can be specified a list of arguments to the prepFun
function. Data preprocessing typically means correlating the data for the GGM or binarizing it for the
Ising model (to this end bootnet provides a binarize function). Next, we estimate the network. To
do so, we assign the estFun argument, a function that takes whatever the output of prepFun was
and estimates a network. The estArgs argument can be used to assign a list of additional arguments
to the function used in estFun. Finally, we need to extract the weights matrix and intercepts.
Assigning functions to the graphFun and intFun arguments respectively can do this. An example
of how these commands work together to run the same bootstrap of the glasso network as is done
above is shown below:
bootnet(FreqBaseline,
prepFun = cor_auto,
prepArgs = list(verbose = FALSE),
estFun = qgraph::EBICglasso,
estArgs = list(n = nrow(bfi)),
graphFun = identity,
intFun= null)
STABILITY OF PSYCHOLOGICAL NETWORKS
26
Conclusion
In this paper, we have summarized the state-of-the-art in psychometric network modeling, and
have shown how to estimate network models can be estimated and discussed inference methods that
can be used after these models are obtained. Furthermore, we have described, for the first time, how
the stability of these inference methods can be assessed using non-parametric, parametric and subset
bootstrap methods. To aid the researcher in this endeavor, we have developed the freely available R
package bootnet. It is of note that, while we demonstrate the functionality of bootnet in this tutorial
using a GGM network, the package can be used for any estimation technique in R that estimates a
PMRF (such as the Ising model).
The stability analysis of a 17-node symptom network of 359 women with PTSD or sub-
threshold PTSD indicated that glasso network more reliably estimated the edge weights than the
partial correlation network. Centrality stability analysis, on the other hand, revealed that only node
strength based on LASSO regularized GGM networks could reliably be interpreted. Furthermore,
subset bootstrapping showed that the centrality indices of the glasso network were much more
consistent after dropping nodes or persons than the centrality indices of the partial correlation
network. These results are in line with expectations, as the LASSO is designed to yield more stable
inferences in complex models with limited data.
The results highlight the importance of stability analysis when estimating psychological
networks. Figure 3 seemed to indicate that both the non-regularized and the regularized networks are
similar in terms of centrality indices. However, had we stopped analyzing the data at this point, then
we would have erroneously concluded that results of both network estimation methods can be used
for network inference, and that the centrality indices are reliably estimated. However, further analyses
revealed that, in this dataset, only the ordering of the node strength statistic in the glasso network can
reliably be interpreted, and that interpretations of the order of symptoms based on closeness or
betweenness centrality are heavily susceptible to sampling error.
STABILITY OF PSYCHOLOGICAL NETWORKS
27
Future directions
We only focused on stability analysis of cross-sectional network models. Assessing variability
on longitudinal and multi-level models is much more complicated and beyond the scope of current
paper; it is also not implemented in bootnet as of yet. We refer the reader to Bringmann et al. (2015)
for a demonstration on how confidence intervals can be obtained in a longitudinal multi-level setting.
We also want to point out that the results obtained here, for instance that the glasso method provides a
more robust estimation of the network structure and the centrality estimates than the partial
correlation procedure, may be idiosyncratic to the particular data used. Simulation studies will be
required to investigate how these and similar procedures perform in terms of stability when
parameters such sample size and number of nodes vary. Future development of bootnet will be aimed
to implement functionality for a broader range of network models, and we encourage readers to
submit any such ideas or feedback to the Github Repository
(www.github.com/sachaepskamp/bootnet).
While working on this project, two new research questions emerged: is it possible to form an
unbiased estimator for centrality indices in partial correlation networks, and consequently, how should
true 95% confidence intervals around centrality indices be constructed? As our example highlighted,
centrality indices can be highly unstable due to sampling variability, and the estimated sampling
distribution of centrality indices can be severely biased. At present, we have no definite answer to
these pressing questions. Given the current emergence of network modeling in psychology,
development of adequate CIs for centrality indices should thus have high priority.
Network stability has been a blind spot in psychological network analysis, and the authors are
aware of only one prior paper that has examined network stability (Fried et al., 2016). Remediating
this blind spot is of utmost importance if network analysis is to be added as a full-fletched
methodology to the toolbox of the psychological researcher.
STABILITY OF PSYCHOLOGICAL NETWORKS
28
References
Agresti, A. (1990). Categorical data analysis. New York, NY: John Wiley & Sons. APA. (2000). Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, text revision.
Washington, DC: American Psychiatric Association. Barzel, B., & Biham, O. (2009). Quantifying the connectivity of a network: The network correlation
function method. Physical Review E - Statistical, Nonlinear, and Soft Matter Physics, 80(4). http://doi.org/10.1103/PhysRevE.80.046104
Bollen, K., & Stine, R. (1993). Bootstrapping goodness-of-fit measures in structural equation models. In K. Bollen & J. Long (Eds.), Testing structural equation models (pp. 111–135). Newbury Park, CA: Sage.
Borsboom, D., & Cramer, A. O. J. (2013). Network Analysis: An Integrative Approach to the Structure of Psychopathology. Annual Review of Clinical Psychology, 9(1), 91–121. http://doi.org/10.1146/annurev-clinpsy-050212-185608
Bringmann, L. F., Lemmens, L. H. J. M., Huibers, M. J. H., Borsboom, D., & Tuerlinckx, F. (2015). Revealing the dynamic network structure of the Beck Depression Inventory-II. Psychological Medicine, 45(4), 747–57. http://doi.org/10.1017/S0033291714001809
Bringmann, L. F., Vissers, N., Wichers, M., Geschwind, N., Kuppens, P., Peeters, F., … Tuerlinckx, F. (2013). A network approach to psychopathology: new insights into clinical longitudinal data. PloS One, 8(4), e60188.
Chernick, M. R. (2011). Bootstrap methods: A guide for practitioners and researchers (Second Edi). Hoboken, New Jersey: John Wiley & Sons.
Costantini, G., Epskamp, S., Borsboom, D., Perugini, M., Mõttus, R., Waldorp, L. J., & Cramer, A. O. J. (2015). State of the aRt personality research: A tutorial on network analysis of personality data in R. Journal of Research in Personality, 54, 13–29.
Costantini, G., Richetin, J., Borsboom, D., Fried, E. I., Rhemtulla, M., & Perugini, M. (2015). Development of indirect measures of conscientiousness: Combining a facets approach and network analysis. European Journal of Personality, 29, 548–567. http://doi.org/10.1002/per.2014
Costenbader, E., & Valente, T. W. (2003). The stability of centrality measures when networks are sampled. Social Networks, 25, 283–307. http://doi.org/10.1016/S0378-8733(03)00012-1
Cramer, A. O. J., Waldorp, L. J., van der Maas, H. L. J., & Borsboom, D. (2010). Comorbidity: a network perspective. Behavioral and Brain Sciences, 33(2-3), 137–150.
Efron, B. (1979). Bootstrap Methods: Another Look at the Jackknife. The Annals of Statistics, 7(1), 1–26.
Epskamp, S. (2015a). bootnet: Bootstrap methods for various network estimation routines. Epskamp, S. (2015b). IsingSampler: Sampling Methods and Distribution Functions for the Ising
Model. Epskamp, S., Cramer, A. O. J., Waldorp, L. J., Schmittmann, V. D., & Borsboom, D. (2012). qgraph :
Network Visualizations of Relationships in Psychometric Data. Journal of Statistical Software, 48(4), 1–18. Retrieved from http://www.jstatsoft.org/v48/i04
Epskamp, S., Maris, G., Waldorp, L. J., & Borsboom, D. (2016). Network Psychometrics. In P. Irwing, D. Hughes, & T. Booth (Eds.), Handbook of Psychometrics. New York: Wiley.
Foa, E. B., Riggs, D. S., Dancu, C. V, & Rothbaum, B. O. (1993). Reliability and validity of a brief instrument for assessing post-traumatic stress disorder. Journal of Traumatic Stress, 6(4), 459–473.
STABILITY OF PSYCHOLOGICAL NETWORKS
29
Foygel Barber, R., & Drton, M. (2015). High-dimensional Ising model selection with Bayesian information criteria. Electronic Journal of Statistics, 9(1), 567–607.
Foygel, R., & Drton, M. (2010). Extended Bayesian Information Criteria for Gaussian Graphical Models. Arxiv Preprint, 1–14.
Fried, E. I. (2015). Problematic assumptions have slowed down depression research: why symptoms, not syndromes are the way forward. Frontiers in Psychology, 6(306), 1–11. http://doi.org/10.3389/fpsyg.2015.00309
Fried, E. I., Bockting, C., Arjadi, R., Borsboom, D., Amshoff, M., Cramer, O. J., … Stroebe, M. (2015). From loss to loneliness: The relationship between bereavement and depressive symptoms. Journal of Abnormal Psychology, 124(2), 256–265.
Fried, E. I., Epskamp, S., Nesse, R. M., Tuerlinckx, F., & Borsboom, D. (2016). What are “good” depression symptoms? Comparing the centrality of DSM and non-DSM symptoms of depression in a network analysis. Journal of Affective Disorders, 189, 314–320.
Friedman, J. H., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9(3), 432–441.
Friedman, J., Hastie, T., & Tibshirani, R. (2014). glasso: Graphical lasso-estimation of Gaussian graphical models.
Fruchterman, T., & Reingold, E. (1991). Graph drawing by force-directed placement. Software: Practice and Experience, 21(11), 1129–64.
Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., & Hothorn, T. (2015). mvtnorm: Multivariate Normal and t Distributions.
Goekoop, R., & Goekoop, J. G. (2014). A Network View on Psychiatric Disorders: Network Clusters of Symptoms as Elementary Syndromes of Psychopathology. PloS One, 9(11), e112734. http://doi.org/10.1371/journal.pone.0112734
Haslbeck, J. M. B., & Waldorp, L. J. (2015). Structure estimation for mixed graphical models in high-dimensional data, (January).
Hien, D. A., Wells, E. A., Jiang, H., Suarez-Morales, L., Campbell, A. N. C., Cohen, L. R., … Nunes, E. V. (2009). Multi-site randomized trial of behavioral interventions for women with co-occurring PTSD and substance use disorders. Journal of Consulting and Clinical Psychology, 77(4), 607–619. http://doi.org/10.1037/a0016227
Kalisch, M., Martin, M., Colombo, D., Hauser, A., Maathuis, M. H., & Bühlmann, P. (2012). Causal Inference Using Graphical Models with the R Package pcalg. Journal of Statistical Software, 47(11), 1–26.
Kolaczyk, E. D., & Csárdi, G. (2014). Statistical analysis of network data with R (Vol. 65). Springer.
Lauritzen, S. L. (1996). Graphical Models. Clarendon Press. Liu, H., Lafferty, J. D., & Wasserman, L. (2009). The nonparanormal: Semiparametric estimation of
high dimensional undirected graphs. The Journal of Machine Learning Research, 10, 2295–2328.
MacCallum, R. C., & Browne, M. W. (1993). The use of causal indicators in covariance structure models: some practical issues. Psychological Bulletin, 114(3), 533–41.
Mcnally, R. J., Robinaugh, D. J., Gwyneth, W. Y., Wang, L., Deserno, M. K., & Borsboom, D. (2014). Mental Disorders as Causal Systems : A Network Approach to Posttraumatic Stress Disorder. Clinical Psychological Science, 1–14. http://doi.org/10.1177/2167702614553230
Newman, M. (2010). Networks: an introduction. Oxford University Press.
Olsson, U. (1979). Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika, 44(4), 443–460. http://doi.org/10.1007/BF02296207
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science,
STABILITY OF PSYCHOLOGICAL NETWORKS
30
349(6251), aac4716–aac4716. http://doi.org/10.1126/science.aac4716
Opsahl, T., Agneessens, F., & Skvoretz, J. (2010). Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks, 32(3), 245–251.
Pearl, J. (2000). Causality: models, reasoning and inference (Vol. 29). Cambridge, UK: Cambridge Univ Press.
R Development Core Team. (2008). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.
Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1–36.
Schmittmann, V. D., Cramer, A. O. J., Waldorp, L. J., Epskamp, S., Kievit, R. A., & Borsboom, D. (2013). Deconstructing the construct: A network perspective on psychological phenomena. New Ideas in Psychology, 31(1), 43–53.
Scutari, M. (2010). Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical Softwa, 35(3), 1–22.
van Borkulo, C. D., Borsboom, D., Epskamp, S., Blanken, T. F., Boschloo, L., Schoevers, R. a, & Waldorp, L. J. (2014). A new method for constructing networks from binary data. Scientific Reports, 4, 5918. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/25082149\nhttp://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=4118196&tool=pmcentrez&rendertype=abstract
van Borkulo, C. D., & Epskamp, S. (2014). IsingFit.
Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. Structural analysis in the social sciences (Vol. 8). http://doi.org/10.1093/infdis/jir193
Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of “small-world” networks. Nature, 393(6684), 440–2. http://doi.org/10.1038/30918
STABILITY OF PSYCHOLOGICAL NETWORKS
31
Appendix A: Loading the PTSD dataset
To download the dataset, go to https://datashare.nida.nih.gov/protocol/nida-ctn-0015 and click on
"CTN-0015 Data Files". The relevant data file is called “qs.csv”, which can be loaded into R by using
the default read.csv function:
Data <- read.csv("qs.csv", stringsAsFactors = FALSE)
This loads the data in long format, which contains a column with subject id's, a column with the
names of the administered items, and a third column containing the item responses. For network
analysis, we need the data to be in wide format. Furthermore, we need to assign that the response
"NOT ANSWERED" indicates a missing response and other responses are ordinal. Finally, we need
to extract relevant dataset at baseline measure for the PTSD symptom frequency scores. To do this,
we can utilize the dplyr and tidyr R packages as follows:
# Load packages:
library("dplyr")
library("tidyr")
# Frequency at baseline:
FreqBaseline <- Data %>%
filter(EPOCH == "BASELINE", grepl("^PSSR\\d+A$",QSTESTCD)) %>%
select(USUBJID,QSTEST,QSORRES) %>%
spread(QSTEST, QSORRES) %>%
select(-USUBJID) %>%
mutate_each(funs(replace(.,.=="NOT ANSWERED",NA))) %>%
mutate_each(funs(ordered(.,c("NOT AT ALL","ONCE A WEEK","2-4 TIMES
PER WEEK/HALF THE TIME", "5 OR MORE TIMES PER WEEK/ALMOST
ALWAYS"))))%>%
'names<-'(seq_len(ncol(.)))
STABILITY OF PSYCHOLOGICAL NETWORKS
32
Supplementary materials: The bootstrap and centrality indices
The bootstrap is not a foolproof methodology that always works. There are, in fact, many
cases in which the bootstrap fails (Chernick, 2011). One such case is when the parameter of interest
has an expected value at the boundary of the parameter space. In bootstrapping edge weights (as
described in the main manuscript above), such as partial correlation coefficients, this is not a problem
because edge weights at the boundary of 1 or -1 rarely occur. However, when continuing network
inference, we do not use the edge weights directly but instead the edge strength: the absolute value of
the edge weight. Here, a boundary problem is present, as we expect many edges to have an expected
value of exactly zero (implying the absence of an edge). To exemplify, suppose we find an edge
weight of 0.01 in our original sample, with a standard error of 0.1. When bootstrapping, 95% of the
samples should fall roughly in the interval -0.19 to 0.21; the absolute value of these samples will,
almost always, be higher than the original sampled edge strength.
We discovered a two-sided bias when many edge weights are expected to equal zero (sparse
network structure). First, absolute edge weights, and as a result centrality indices, are highly biased
estimators of the true parameter. This makes sense: if the true edge weight is zero, any estimated
absolute edge weight based on a sample will be higher than zero. Second, bootstrapped absolute edge
weights are biased as well when the expected value is zero, because the bootstrap sampling
distribution is likely to be centered near zero leading to many higher absolute edge weights. To
exemplify, we simulated 10,000 pairs of independent variables (true correlation of zero) and
investigated the sample absolute correlation as well as the absolute correlation of a single
nonparametric and parametric bootstrap. The sample correlation was on average 0.081 higher than the
true correlation of 0, the nonparametric bootstrap was on average 0.032 higher than the sampled
absolute correlation and the parametric bootstrap was on average 0.033 higher than the sampled
absolute correlation.
Computing node strength takes a sum and computing closeness takes the inverse of a more
complicated sum of these biased values; adding up these biases leads to highly inaccurate estimates of
the true parameter. Indeed, we found that it is not uncommon for quantiles of the bootstrapped
strength and closeness to not contain the true parameter at all, nor even the parameter value based on
STABILITY OF PSYCHOLOGICAL NETWORKS
33
the original sample. This bias seemed to be mostly present using non-regularization estimation
methods and low sample sizes. Thus, constructing true CIs for centrality indices—containing the true
parameter 95% of the times—is a challenging research question, involving correcting for bias of the
sample due to true network structure and sample size as well as correcting for the bias in the bootstrap
samples. The current paper does not aim to solve this highly technical question, as we are not so much
interested in estimating a region around the true parameter value as that we are interested in showing
the sampling variability of these centrality indices, even if they are biased.
As explained above, the bootstrap cannot be used to form 95% CIs on centrality indices.
However, we can use the bootstrap to show the reader the importance of taking centrality instability
into account. If we assume that the estimated partial correlation matrix is the true network structure,
then the parametric bootstrap can be used to show the sampling distribution of such a network model:
ParametricBoot_pcor <- bootnet(FreqBaseline, 100,default = "pcor",
type = "parametric")
The 95% quantile regions can be plotted using the following codes:
plot(ParametricBoot_pcor, statistics = c("strength", "closeness",
"betweenness"), CIstyle = "quantiles")
which will return a warning on that the intervals cannot be interpreted as 95% CIs. The argument
CIstyle = "quantiles" makes sure quantiles are plotted. By default, bootnet plots intervals based on the
sample centrality plus and minus two times the standard deviation of the bootstraps. Which are more
interpretable, even though they still are not true 95% CIs.
Figure S1 shows the resulting plot. If the partial correlation network used is the true model,
then the red line shows the true centrality indices and the gray area the 95% quantile region of the
sampling distribution. The placement of the red line indicates the aforementioned bias in both
bootstrapping and sampling of centrality indices. The size of the gray area is large in all indices,
indicating that the order of centrality indices from any sample would be very hard to interpret. In the
sample on which these plots are based, none of the 1000 samples captured the true order of any of the
centrality indices, and the probability that the node with the true strongest centrality was correctly
detected was only 43%, 36% and 33% for strength, closeness and betweenness respectively.
STABILITY OF PSYCHOLOGICAL NETWORKS
34
These plots only show the sampling distribution if the used network is the true model---which
it is not. Therefore, these plots should be interpreted with care. However, they do highlight the
importance of stability analysis on centrality indices, as the sampling distribution of the true model is
likely not much smaller than the ones shown in these plots.
Figure S1: Parametric bootstrap on centrality indices based on the partial correlation network.
Supplementary references:
Chernick, M. R. (2011). Bootstrap methods: A guide for practitioners and researchers (Second Edi.).
Hoboken, New Jersey: John Wiley & Sons.