NIHMS266526 Supplement Supplemental Online Materials

Supplementary Online Materials for:

Formation of regulatory patterns during signal propagation in a mammalian cellular network

Avi Ma’ayan, Sherry L. Jenkins, Susana Neves, Anthony Hasseldine, Elizabeth Grace, Benjamin Dubin-Thaler, Narat J. Eungdamrong, Gehzi Weng, Prahlad Ram, J. Jeremy Rice, Aaron Kershenbaum, Gustavo A. Stolovitzky, Robert D. Blitzer, and Ravi Iyengar

Contents

Methods
Supporting text
Supporting Figures S1-S11
Supporting Table S1
Supporting references
Author contributions
Table of supporting external files: text files, source code segments, movies and spreadsheets

Methods

Constructing the network: We used published research literature to identify the key components of signaling pathways and cellular machines, and their binary interactions. Most components (~80%) have been described in hippocampal neurons or related neuronal cells. The remaining components have been described in other cell types, but are included because they are key components in processes known to occur in hippocampal neurons, such as translation. We then established that these interactions were both direct and functionally relevant. Every connection was individually verified by at least one of the authors of this paper, who read the relevant primary paper(s). The resulting system comprises 545 components (nodes) and 1259 links (connections). We used arbitrary but consistent rules to sort components into various groups. For instance, transcription factors are considered part of the transcriptional machinery, although it would be equally valid to consider them the most downstream components of the central signaling network. Similarly, the AMPA receptor-channel (AMPAR) is considered part of the ion channels in the electrical response system, since its activity is essential to defining the postsynaptic response, although it binds to and is activated by glutamate and hence can also be considered a ligand-gated receptor-channel in the plasma membrane.

The links were specified by two criteria: function and biochemical mechanism. Three types of functional links were specified. This follows the rules used for representation of pathways in


Science’s STKE (S1). Links may be activating, inhibitory or neutral. Neutral links do not specify directionality between components, and are mostly used to represent undirected or bidirectional interactions such as scaffolding and anchoring. The biochemical specification defines each reaction as either a non-covalent binding interaction or an enzymatic reaction. Within the enzymatic category, reactions were further specified as phosphorylation, dephosphorylation, hydrolysis, etc. These two criteria are independent and were defined for all interactions. For the analyses in this study we used only the functional criterion: the activating, inhibitory or neutral specification.

We chose papers that demonstrated direct interactions supported by either biochemical or physiological effects of the interactions. From these papers we identified the components and interactions that make up the system we analyzed. During this specification process we did not consider whether these interactions would come together to form higher-order organizational units. Each component and interaction was validated by a reference from the primary literature (1202 papers were used). A list of the authors who read the papers to validate the components and interactions is provided under Author contributions.

Storage of the network data: The data describing the components (nodes) and their interactions (links) that compose the network are stored in a text file (Text file S1) using the flat-file *sig format described below.

The *sig flat text file format:
Source Name: cellular component that is affecting a target component
Source Human Accession: Swiss-Prot accession if available
Source Mouse Accession: Swiss-Prot accession if available
Source Type: the type of molecule classification for this component
Source Location: cellular localization of the component
Target Name: generalized cellular component that is affected by the source component
Target Human Accession: Swiss-Prot accession if available
Target Mouse Accession: Swiss-Prot accession if available
Target Type: the type of molecule classification for this component
Target Location: generalized cellular localization of the component
Effect: activation (+), inhibition (-), or neutral (0)
Type of Interaction: type of chemical interaction directly linking the two components
PubMed ID: PubMed database accession number

Sorting cellular components (nodes) into cellular machines: Components (nodes) were separated into functional machines based on their molecule type and location attributes using Code segment S7.
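The 13-field *sig record listed above could be read along the following lines. This is a minimal sketch rather than the authors' Code segment: it assumes one interaction per line with whitespace-separated fields in the listed order, and the function names (`parse_sig_line`, `parse_sig`) and the example records are hypothetical placeholders.

```python
from collections import namedtuple

# Field order follows the *sig description in the text. The whitespace
# delimiter is an assumption; adjust it if the file uses another separator.
FIELDS = [
    "source_name", "source_human_acc", "source_mouse_acc", "source_type",
    "source_location", "target_name", "target_human_acc", "target_mouse_acc",
    "target_type", "target_location", "effect", "interaction_type", "pubmed_id",
]
Interaction = namedtuple("Interaction", FIELDS)


def parse_sig_line(line):
    """Parse one record; return None for a blank line."""
    parts = line.split()
    if not parts:
        return None
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = Interaction(*parts)
    if record.effect not in {"+", "-", "0"}:
        raise ValueError(f"unknown effect symbol: {record.effect!r}")
    return record


def parse_sig(text):
    """Parse a whole *sig text into a list of Interaction records."""
    records = []
    for line in text.splitlines():
        record = parse_sig_line(line)
        if record is not None:
            records.append(record)
    return records


# Made-up records: names, accessions and PubMed IDs are placeholders.
EXAMPLE = """\
LIGAND_X NA NA Ligand Extracellular RECEPTOR_Y NA NA Receptor Membrane + Binding 0000000
KINASE_A NA NA Kinase Cytosol CHANNEL_B NA NA Channel Membrane - Phosphorylation 0000000
"""
links = parse_sig(EXAMPLE)
```

Keeping the effect symbol as a validated string makes it straightforward to select only the functional criterion (activating, inhibitory or neutral) used in the analyses.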
Visualization of the entire network: The diagram of the network in Fig. S1 was created with Code segment S8, which creates Text file S2. This text file can be loaded into the Pajek software for visualization (S2).

Subnetworks obtained by signal propagation from ligands: To generate these subnetworks we used a depth-first search algorithm that expands the search solely in the downstream direction, to conform to the biologically specified direction of information transfer.


For example, when a Gs protein is activated by an upstream ligand-receptor interaction, it connects only to the downstream component adenylyl cyclase. The subnetworks do not include another Gs-coupled receptor unless that receptor also interacts with the specific ligand. The recursive algorithm (using depth-first search) is listed as Code segment S9. It was used to create 15 subnetworks for each node of the type “Ligand” by changing the number of steps from the ligand. Growing the subnetwork by increasing the number of steps from the ligand is termed breadth-first search here and is illustrated in Movie S20. The subnetworks were then analyzed by computing the number of nodes, links, clustering, average path length and motifs. Visualization of the resulting data for connections per step for all ligands, Spreadsheet S23, was created with Matlab™, Natick, MA (S3). Three sets of subnetworks induced by the nodes GLUTAMATE, NE and BDNF were analyzed for number of nodes and links, clustering, average path length and motifs with Code segment S10. These results, Spreadsheet S24, were visualized with Excel™ (S4).

Overall statistics of the network: The connectivity distribution data, Spreadsheet S25, were plotted with Matlab™, Natick, MA (S3). The characteristic path length was computed using Floyd’s algorithm (S5), with Code segment S11. The clustering coefficient was implemented with custom code based on the concept developed by Watts and Strogatz (S6), using Code segment S12. The grid coefficient was implemented with custom code based on the concept developed by Caldarelli et al. (S7), using Code segment S13. The results, Spreadsheet S26, were compared to the same parameters computed for 100 shuffled networks, each consisting of a single island and maintaining the same connectivity distribution, generated with an algorithm developed by Milo et al. (S8), using Code segment S14.

Network Motifs: Motifs in the network were counted using the MFinder program developed by Kashtan et al. (S9). Motifs of size 3, 4, 5 and 6 were visualized using Visio™ (S10).
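The step-limited, downstream-only expansion from a ligand described above can be illustrated as follows. This is a sketch under stated assumptions, not Code segment S9: it uses an iterative breadth-first layering instead of the recursive search, it follows directed links only (neutral links are left out), and the node names in the toy network are hypothetical.

```python
from collections import deque


def downstream_subnetwork(links, source, max_steps):
    """Return {node: step} for every node reachable from `source` within
    `max_steps` steps, following links in the source -> target direction."""
    successors = {}
    for src, dst in links:
        successors.setdefault(src, []).append(dst)

    steps = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if steps[node] == max_steps:
            continue  # do not expand beyond the allowed number of steps
        for nxt in successors.get(node, []):
            if nxt not in steps:
                steps[nxt] = steps[node] + 1
                queue.append(nxt)
    return steps


# Hypothetical toy links (source, target), modelled on the Gs example.
TOY_LINKS = [
    ("LIGAND_1", "RECEPTOR_1"),
    ("RECEPTOR_1", "Gs"),
    ("RECEPTOR_2", "Gs"),  # second Gs-coupled receptor, not bound by LIGAND_1
    ("Gs", "ADENYLYL_CYCLASE"),
]

# One subnetwork per allowed number of steps from the ligand.
subnetworks = {
    n: downstream_subnetwork(TOY_LINKS, "LIGAND_1", n) for n in range(1, 4)
}
```

Because only outgoing links are followed, RECEPTOR_2 never enters the subnetwork even though it shares the Gs node, which mirrors the behavior described for Gs-coupled receptors.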
See Table S29 for the complete results for the motifs of sizes 3 and 4. Output files in text format from the MFinder program: Motifs size 3: Text file S3. Motifs size 4: Text file S4. Motifs size 5: Text file S5. Motifs size 6: Text file S6.

Our method for motif search: In our detailed analysis of subnetworks, we considered only closed-loop circular motifs. We searched for these motifs using a depth-first search algorithm (Code segments S17 through S19). An example of counting motifs in a toy network is given below:


[Figure: toy network with nodes A, B, C and D]

The above toy network contains 1 positive feedback loop of size 3, 1 negative feedback loop of size 3 and 1 negative feed-forward motif of size 4. Nodes C and D contribute to 3 motifs whereas nodes B and C contribute to 2 motifs. This is slightly different from the method used by Kashtan et al. (S9) because, with their method, A, B, C, and D would not be considered a feedforward loop due to the link between A and D.

Positive vs. negative feedback and feedforward loops: Positive feedback loop motifs are defined as loops in which all links are positive (activating) or in which the number of negative (inhibitory) links is even. Negative feedback loops are loops with an odd number of negative links. Feedforward loops have two “arms”, each starting at a source node and merging into a target (sink) node. To determine whether a feedforward loop is positive or negative, each “arm” was evaluated separately. Both “arms” must be positive for the feedforward motif to be positive. An “arm” in which all links are positive or the number of negative links is even is considered positive, whereas an “arm” with an odd number of negative links is considered negative.

Subnetworks from source to target nodes: Series of subnetworks from specific source nodes (generally ligands) to specific targets (components within machines such as channels or transcription factors), with limited path lengths, were created using Code segment S15. The concept is illustrated in Movie S21. Shuffled networks, in which only links that do not involve the source and target nodes were shuffled, were created as a statistical control. In the control subnetworks, we preserved the biologically specified connectivity and directionality between ligands and their receptors and from the immediate upstream components to the AMPA channel and CREB. For all other connections we randomly swapped the direction of the connection while preserving the overall connectivity structure of the network.
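The parity rules above for classifying loops as positive or negative can be written compactly. The following is a minimal sketch of those rules only (it does not search for motifs); the function names are hypothetical, and links are represented by their effect symbols "+" and "-".

```python
def path_sign(effects):
    """Return '+' if the number of inhibitory ('-') links is even, else '-'."""
    negatives = sum(1 for effect in effects if effect == "-")
    return "+" if negatives % 2 == 0 else "-"


def feedback_loop_sign(loop_effects):
    """A feedback loop is positive when it has an even number of '-' links."""
    return path_sign(loop_effects)


def feedforward_sign(arm1_effects, arm2_effects):
    """A feedforward loop is positive only if both arms are positive."""
    if path_sign(arm1_effects) == "+" and path_sign(arm2_effects) == "+":
        return "+"
    return "-"


# A three-node loop with two inhibitory links is positive (even parity).
example_loop = feedback_loop_sign(["+", "-", "-"])
# A feedforward loop with one negative arm is negative.
example_ffl = feedforward_sign(["+"], ["+", "-"])
```

Evaluating each arm separately matters: two arms that are each negative would have an even number of negative links in total, yet the motif is still classified as negative under the rule stated above.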
Curve fitting of the data, Spreadsheet S27, was done with Excel™ (S4).

Subnets based on nodal connectivity: A series of subnetworks based on nodal connectivity was created using Code segment S16. The concept is illustrated in Movie S22. The search for feedback loops was implemented with Code segment S17, for feed-forward loops with Code segment S18, and for bi-fan motifs with Code segment S19. The results are summarized in Spreadsheet S28 and were visualized with Excel™ (S4). Highly connected nodes that are excluded from a subnetwork are not counted as single-node islands. For example, PKA, a highly connected node (k=49), is not included in the subnetworks with a lower connectivity threshold and is not counted as an island by itself.

Computing the Density of Information Processing (DIP): DIP is a measure of the local density of motifs and their interconnectedness within the interaction space of the network. DIP is defined as:

DIP_i = \left( \frac{M_i - M_{i-1}}{L_i - L_{i-1}} \right) \cdot GC_i \qquad (1)

where M_i = FBL3_i + FBL4_i + FFL3_i + FFL4_i + BIFAN_i
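As a numerical illustration, Eq. 1 can be evaluated per step as sketched here, assuming it reads DIP_i = ((M_i - M_{i-1}) / (L_i - L_{i-1})) · GC_i with M_i the sum of the five motif counts at step i. The function names and all numbers are made up for the example.

```python
MOTIF_CLASSES = ("FBL3", "FBL4", "FFL3", "FFL4", "BIFAN")


def motif_total(counts):
    """M_i: total number of motifs of the five classes at one step."""
    return sum(counts[name] for name in MOTIF_CLASSES)


def dip(m_curr, m_prev, l_curr, l_prev, gc_curr):
    """Motifs gained per link gained between two steps, scaled by the
    grid coefficient of the current step."""
    added_links = l_curr - l_prev
    if added_links == 0:
        raise ValueError("no links were added at this step")
    return (m_curr - m_prev) / added_links * gc_curr


# Made-up counts for two consecutive steps (not data from the study).
step_prev = {"FBL3": 1, "FBL4": 0, "FFL3": 2, "FFL4": 1, "BIFAN": 0}
step_curr = {"FBL3": 2, "FBL4": 1, "FFL3": 3, "FFL4": 2, "BIFAN": 2}
value = dip(motif_total(step_curr), motif_total(step_prev),
            l_curr=50, l_prev=20, gc_curr=0.05)
```

With these numbers, 6 motifs are gained over 30 added links, so the ratio is 0.2 and the resulting DIP is about 0.01 after scaling by the grid coefficient.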


M_i is the total number of feedback loops, feedforward loops and bi-fan motifs, L_i is the total number of links, and i denotes the step. FBL3 and FBL4 are feedback loops of size 3 and 4, FFL3 and FFL4 are feedforward loops of size 3 and 4, and BIFAN are bi-fan motifs of size 4. GC is the grid coefficient, representing interconnectedness of the motifs, computed for the subnetwork at step i.

Computing the Motif Location Index (MLI): MLI measures the concentration of specified motifs at various locations within the network. MLI can vary from 0 to 1 depending on the relative distance of the motif from the extracellular ligands to the cellular machines, where 0 indicates location at the level of the machines. MLI was calculated as follows:

MLI = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{CPLM_i}{CPLM_i + CPLL_i} \right) \qquad (2)

where n is the size of the motif, CPLM is the characteristic path length from a node within the motif to all other nodes in the cellular machine, and CPLL is the characteristic path length from a node to all extracellular ligands. If a node is an extracellular ligand, then CPLL = 0; if the node is in the plasma membrane, CPLL = 1. If a node belongs to a cellular machine, CPLM = 0. The average shortest path length was computed using Floyd’s algorithm (S5).

Supporting text

Functional Organization: From Components to Information Processing Motifs

Components within mammalian cells interact with one another to form local networks that together form a single large network. This organization is essential for cellular components to coordinate their individual activities and achieve the cohesiveness needed for cellular functions. Information needs to flow between components in a continuous and organized manner. Determining how this flow of information occurs is a crucial step in understanding the functional organization of mammalian cells. This system of interacting cellular components based on phenotypic behavior allows us to analyze the flow of information between the components to identify the emergence of regulatory motifs that are capable of processing information as it flows through the network.

Regulatory motifs such as feedback loops and feedforward motifs allow the cell to process information from extracellular signals, and decide when such information persists and when it is transient. The presence of positive feedback loops arising from coupled biochemical reactions leads to switching behavior that can enable state changes (S11). State change triggered by biochemical switches has been demonstrated in developmental systems (S12).

The overall profile of motifs may be an initial indicator of the cell’s information processing capability, when the quantitative and temporal characteristics of the interactions and consequent motif formation are taken into account. Such quantitative specification may lead to comparative profiles different from that observed here. Nevertheless, the identification and characterization of motifs allows us to move from components and interactions to the next level of organization within the cell.

Distinct regulatory motifs are active in response to signals from different ligands. The balance of the emergent positive and negative motifs may define the capability of the ligand to induce plasticity or maintain homeostasis. Comparison between motifs assumes that the motifs can be formed and function concurrently. The relationship between the interaction steps in this pseudodynamic analysis which considers propagation of functional connectivity over chemical space and time steps needs to be specified at the level of individual interactions. This is discussed in


greater detail below. The relationship between distances in chemical space in such a functional organization scheme and subcellular localization in terms of organelles and identified physical compartments requires further analysis to identify where the two types of organization overlap and where they diverge.

At the functional level, each network configuration is likely to have its own distinct pattern of regulatory motifs, and thus to define the capability of the cell for either maintaining homeostasis or promoting state change such as that seen in synaptic plasticity. Would a large ensemble of such states make the system intractably complex, such that we would not be able to either understand or predict its behavior? This is unlikely because many of the states may have profiles of regulatory motifs that produce convergent effects on cellular machines. Such convergence would provide a molecular basis for redundancy of cellular function and endow the cellular system with robustness. Functionally, cellular network states may fall into two broad categories: those that preserve homeostasis during system perturbation and those that promote reorganization of the parts of the cellular network. Such reorganization may include the synthesis of new components such as transcription factors as well as other components that are required for expression of phenotypic behavior. The notion that multiple network states may converge to yield a limited set of phenotypic responses is supported by recent studies of networks of neurons where similar network behavior, as defined by firing patterns, can arise from multiple network configurations based on individual parameters that are quite different (S13). Perhaps such convergence represents a higher order of organized redundancy arising from the many feedforward motifs. Thus, it is likely that cellular systems are robust at varying levels of organization.

From Qualitative Representation to Quantitative Behavior of Motifs

Qualitative representations of cellular interaction such as those analyzed here do not take into account individual reaction rates that define individual links.
This simplifying assumption leads to questions such as whether the identity and function of regulatory motifs would be preserved if the reaction rates for the links that make up the motifs vary over many orders of magnitude. Experimental studies have shown that some of the links are relevant only at specific times, under different perturbations, and at different locations within the cell. Such conditionality can be manifested in many formats. Positive (also termed coherent) feedforward loops (FFL) can function in several ways. For example, they can act as coincidence detectors (AND gates), provide redundant pathways to the same effectors (OR gates), or lead to long sustained activity if one arm is fast and the other arm is slower (S14, S15). These functions depend on the reaction rates of the two arms. To obtain a preliminary idea of the potential validity of the motifs we have identified in this study, we gathered a set of representative reactions from various classes (i.e., ligand-protein, protein-protein, and enzymatic reactions) and evaluated their rates by comparing on and off rates and Kcat values. The maximum observable difference was about 1000-fold (Table S2). We then constructed a toy ODE model for motif 44 (Fig. S3). A schematic of the model is shown (Fig. S11A). We varied the difference in rates between the two arms by a factor of up to 1000. Although there are temporal differences in the output, the overall output profiles are quite similar within the same time scale (Fig. S11B). These initial simulations suggest that the integrity of the motif is preserved even with widely varying rates. Although this toy ODE model has few components, it should be noted that this is likely to be a limiting case, since with multiple components in each arm, reaction rates are also likely to vary, thus reducing the difference between the two arms.
This is especially true for situations two to three steps from the ligand, the location where the motifs become more prevalent. Thus, it is likely that many of the motifs identified by qualitative analysis will be operative even when interaction rates vary and the concentrations of components change in a regulated manner. Definitive proof that


each of these motifs is functional will only come when each reaction within a motif is parameterized and input/output relationships are determined, and when the existence of the motifs is experimentally verified.

Configuration of Motifs and choosing between homeostasis and plasticity: The conversion of short-term biochemical changes into long-term physiological changes, including state changes, is often driven by prolonged activation of key protein kinases and/or the activation or induction of transcription factors. The extent and duration of activity of these downstream components is regulated by the upstream signaling network. Since the network contains various types of regulatory motifs, the balance between these motifs will determine the extent and duration of the signal reaching these key components. An abundance of positive feed-forward or feedback loops would evoke the extended activation of these components, while the presence of negative feedback loops and gates could limit the duration of activation of the downstream components. Hence, the ratio of the positive to negative feedback and feed-forward loops can determine whether or not downstream components are activated for extended periods at sufficient levels. The extent and duration of activation of downstream components such as the protein kinases or transcription factors may in turn determine whether the cell maintains homeostatic behavior or undergoes long-term physiological changes. Thus, the ratio of positive to negative feedback and feed-forward motifs could be a key indicator of whether, in the presence of external signals, the cell is configured to maintain homeostatic balance or undergo state change. Since the propagation of connectivity by different ligands involves different components and pathways, the ratio of the positive and negative motifs is a function of the types of signals the cell receives and hence is conditional.
Comparison of the ratio of these motifs recruited per step for different ligands, glutamate, NE and BDNF clearly demonstrates that the regulatory configuration is defined by the ligand that initiates the signal propagation. Thus, by balancing the positive to negative motifs for different extracellular signals the cell may be able to tune itself to either respond or not respond to extracellular signals by undergoing long-term physiological changes. It is noteworthy that for both glutamate and BDNF in the pathways to CREB the positive and negative loops are evenly balanced through nine steps. Such balance of regulatory loops provides an explanation of how homeostasis within the cell can be achieved even when a perturbing signal propagates through the system. The relative abundance of the positive feedback and feedforward loops as compared to the negative loops in the sub-network for NE to AMPA channels and to CREB is also noteworthy. These ratios may indicate that this sub-network can hold and transfer information across time-scales leading to persistent changes. This configuration of motifs might provide a systems level explanation of why the cAMP pathway is associated with the late phase of long-term potentiation in the CA1 neuron (S16). It is also noteworthy that propagation of connectivity from BDNF to AMPA channels and CREB results in equal numbers of positive and negative feedback loops suggesting that BDNF would not play an important role in changing the state of this cell. This is in agreement with experimental observation that BDNF induces plasticity by affecting the functions of the presynaptic CA3 neuron (S17) although acute effects of BDNF are observed in the postsynaptic CA1 neuron (S18). Thus some of the characteristic responses evoked by important neurotransmitters may be understood by identifying the configurations of the regulatory motifs they evoke within the cell.


Supporting figures Fig. S1

Fig. S1 Visualization of the mammalian neuronal cellular network using the Pajek software. The network is visualized by placing nodes as triangles based on their functional compartments. The size of each triangle indicates the level of connectivity of the node. Green arrows represent activation links, red arrows inhibition links, and blue arrows neutral links. The network data were stored in the .net file format using custom code (Code segment S8). The output was then loaded into the Pajek program (S2) for visualization.


Fig. S2 A C

B Statistical characteristics of the fully connected network: Average links per node: 4.62 Characteristic Path Length: 4.21 (3.81 ± 0.026)* Grid Coefficient: 0.026 (0.005 ± 0.0005)* Clustering Coefficient: 0.11 (0.03 ± 0.004)* * Mean ± SD computed for 100 shuffled networks Fig. S2 Statistics of the fully connected network A) Distribution of interactions (links) into various categories. Interactions (links) in the cellular network were sorted using two different criteria: the biochemical reaction and the effect of the link. Upper pie chart shows sorting by biochemical reaction mechanism and the lower pie chart shows the distribution of connections that activate or inhibit downstream components. Neutral links indicate interactions where the directionality is unspecified. B) Overall network characteristics: In parentheses are the mean and standard deviation of the characteristic path-lengths, clustering-coefficients and grid-coefficients computed for 100 shuffled “null” networks with the same connectivity distribution as the cellular network. These shuffled networks were created using an approach based on Milo et al (S8). C) Log-log plot of the connectivity distribution. The number of links per node (k) is plotted against the number of nodes with the same level of connectivity.
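The shuffled “null” networks referred to in this caption keep each node's connectivity while randomizing who is linked to whom. A common way to achieve this is repeated link switching, sketched below. This is an illustration of the switching idea, not Code segment S14: it does not enforce that the shuffled network remains a single island, and the function names and toy links are hypothetical.

```python
import random
from collections import Counter


def shuffle_preserving_degrees(edges, n_swaps, seed=0):
    """Randomize a directed network by switching link targets:
    (a -> b, c -> d) becomes (a -> d, c -> b). In- and out-degrees of
    every node are unchanged by each accepted switch."""
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    done = 0
    attempts = 0
    max_attempts = 100 * n_swaps
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        # Reject switches that would create self-loops or duplicate links.
        if a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
        done += 1
    return edges


def degree_sequences(edges):
    """Return (out-degree, in-degree) counters for a directed edge list."""
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    return out_deg, in_deg


# Hypothetical toy network.
TOY = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"),
       ("D", "E"), ("E", "F"), ("F", "A")]
shuffled = shuffle_preserving_degrees(TOY, n_swaps=20, seed=1)
```

Statistics such as the characteristic path length or clustering coefficient can then be computed on many such shuffled networks to obtain the mean and standard deviation reported in parentheses.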


Fig. S3

Fig. S3 Schematic representation of the motifs identified within the cellular network Network Motifs identified using the MFinder program. The figure visually represents the text based output, produced by the MFinder program. These motifs are the most statistically significant identified network motifs within the network. The MFinder program searches for motifs in directed networks, it does not distinguish between positive and negative links. Thus, for this analysis we considered positive and negative links as unidirectional and neutral links as bi-directional. The MFinder program was developed by Alon and coworkers (S9). The network motifs of size 5 and 6 were identified using the sampling methods. The various types of motifs are identified by numbers. The counts of the various types of motifs in the fully connected network are given in Table S1.


Table S1

Motif counts

Motif #    CN*     SN**             Z-score
31         16      4.8 ± 2.8        3.98
32         22      9.3 ± 3.3        3.84
33         14      8.1 ± 2.6        2.30
34         36      12.5 ± 3.7       6.38
35         32      12.2 ± 3.8       5.16
36         25      9.8 ± 3.7        4.12
41         1011    186.6 ± 32.6     25.31
42         108     68.2 ± 14.8      2.69
43         26      7.2 ± 4.8        3.91
44         303     104.0 ± 15.5     12.88
45         57      17.8 ± 7.3       5.39
46         105     40.0 ± 10.9      5.97
47         49      31.5 ± 8.3       2.12

* CN- Cellular Network. ** SN- Shuffled networks.

Mean ± SD computed for 100 shuffled networks.

Using the sampling method (S8):

Motif counts

Motif #    CN*      SN**              Z-score
51         0.254    0.023 ± 0.025     9.27
52         0.065    0.002 ± 0.003     11.42
53         0.662    0.005 ± 0.011     60.05
54         0.048    0.000 ± 0.002     26.57
55         0.100    0.003 ± 0.007     13.92
56         0.020    0.0 ± 0.0         NA
57         0.016    0.0 ± 0.0         NA
61         0.395    0.001 ± 0.011     36.01
62         0.189    0.0 ± 0.0         NA
63         0.050    0.0 ± 0.0         NA
64         0.096    0.0 ± 0.0         NA
65         0.070    0.0 ± 0.0         NA
66         0.059    0.0 ± 0.0         NA
67         0.049    0.0 ± 0.0         NA

Table S1 Counts of the various types of motifs found in the cellular network Motifs were counted using the MFinder program (S9). Counts for the motifs in the cellular network (CN) are compared to the mean for 100 control (shuffled) networks. Z-score indicates the statistical significance for the number found in the real network as compared to the mean of the control networks. Motifs of size 5 and 6 were estimated using the sampling method described by Kashtan et al. (S9). Estimated values are given as fractions (count of the specified motifs / the total number of motifs of the same size).
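The Z-scores in Table S1 can be checked approximately from the tabulated values, assuming the usual definition Z = (count in the cellular network - mean of the shuffled networks) / SD of the shuffled networks. The sketch below uses three rows copied from the table; the function name is hypothetical, and small differences from the tabulated Z-scores are to be expected because the tabulated means and SDs are rounded.

```python
def z_score(real_count, shuffled_mean, shuffled_sd):
    """Z-score of a motif count against shuffled networks.
    Returns None when the SD is zero (reported as NA in the table)."""
    if shuffled_sd == 0:
        return None
    return (real_count - shuffled_mean) / shuffled_sd


# (CN, SN mean, SN SD) copied from Table S1 for motifs 31, 41 and 56.
rows = {
    31: (16, 4.8, 2.8),
    41: (1011, 186.6, 32.6),
    56: (0.020, 0.0, 0.0),
}
scores = {motif: z_score(*values) for motif, values in rows.items()}
```

For motif 31 this gives about 4.0 against a tabulated 3.98, and for motif 41 about 25.29 against a tabulated 25.31.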


Fig. S4 A B

C D E

Fig. S4 Analysis of subnetworks from glutamate, NE and BDNF The number of links (A) and nodes (B) added per step, downstream from three ligands known to regulate plasticity in hippocampal neurons. C) Characteristic Path Length (CPL), D) Clustering Coefficient (CC) and E) Grid Coefficient (GC) computed for subnetworks emanating from glutamate, nor-epinephrine and BDNF. Fig. S5 A B C

Fig. S5 Comparison of positive vs. negative feed-forward loop motifs in subnetworks emanating in steps from the three ligands: glutamate, nor-epinephrine and BDNF Counts were obtained using the source code described under Methods. Positive feed-forward loop motifs are feed-forward loops where both “arms” to the target node are positive. If at least one arm is negative the feed-forward is considered negative. Positive “arm” is defined as an “arm” with either no negative links or even number of negative links.


Fig. S6

Fig. S6 Regulatory motifs identified in subnetworks emanating in steps from glutamate (Glu), norepinephrine (NE) and BDNF The motifs identified using our algorithms (see custom codes S17-S19) can classify motifs based on the different types of links and nodes. This is an enhancement to the MFinder program that only identifies more abstract motifs based on directionality only. A) The scaffold motif is made of three nodes connected solely by neutral links. A-I: A schematic illustration of a scaffold motif. A-II: The counts of scaffold motifs found in subnetworks generated from the three ligands A-III: Representative examples of the scaffold motifs. B) The bi-fan motif B-I: Schematic representation of the bifan motif. B-II: The counts of bi-fan motifs found in subnetworks generated from the three ligands. B-III: Representative examples of bifan motifs. C) The feedforward loop motifs are composed of three or four nodes connected with directed links only where a source node feeds into a target node through two alternative routes. C-I: Schematic representations of the feedforward loops


C-II: The counts of feed-forward loop motifs found in the subnetworks generated from the three ligands C-III: Representative examples of feed-forward motifs. Positive feed-forward loop motifs are feed-forward loops where both “arms” to the target node are positive. Positive “arm” is defined as an “arm” with either no negative links or even number of negative links. If at least one arm is negative the feed-forward is considered negative. Abbreviations: PKA: protein kinase A, PKC: protein kinase C, CaM: calmodulin, CaN: calcineurin, GlyR: glycine receptor, NMDAR: NMDA receptor, AC2: adenylyl cyclase 2, PLCβ: phospholipase C beta, CREB: cyclic-AMP response element binding protein. CREM: cyclic-AMP response element modulator. β-ARK: beta adrenergic receptor kinase. Fig. S7 A

B

Fig. S7 Size of subnetworks from extracellular ligands to specified effectors A) Schematic representation of the source and target nodes for the subnetworks starting from the three extracellular ligands (source nodes): glutamate, nor-epinephrine (NE) and brain-derived neurotrophic factor (BDNF), to reach the effectors (target nodes): AMPA receptor-channels (AMPAR) or the transcription factor cyclic-AMP response element binding protein (CREB). B) The changes in the number of nodes as more steps are allowed to reach the target nodes (effectors) from the source ligands for six sets of subnetworks.


Fig S8 A

B C

Fig. S8 Feedback and feed-forward loops in subnetworks from extracellular ligands to CREB A) Schematic of the source and target nodes for the subnetworks starting from the three extracellular ligands: glutamate, nor-epinephrine (NE) and brain-derived neurotrophic factor (BDNF), (source nodes) to reach the effector CREB (target node). B-C) Counts of the number of total three and four component positive and negative feedback and feedforward loops, in subnetworks where more steps are gradually allowed between the source and target nodes.


Fig. S9 A

B C D

E F G

Fig. S9 Subnetworks characteristics from extracellular ligands to AMPAR A) Schematic of the source and target nodes for the subnetworks starting from the three extracellular ligands: glutamate, nor-epinephrine (NE) and brain-derived neurotrophic factor (BDNF), (source nodes) to reach the effector AMPA channel/receptor (AMPAR) (target node). B-D) Changes in the number of links as more steps are allowed to reach the effectors from each ligand are compared to the same analysis with shuffled networks. Only the directionality of the links that do not involve the ligands or the effectors was randomly swapped while preserving the connectivity. The resultant graphs for both the cellular network and shuffled networks were curve fitted. For all cellular subnetworks the best fit function was linear (R2 =0.99-0.96). Individual values are given as lin R2. For all of the shuffled networks the best fit was obtained for either exponential or power law functions (lin and pow respectively). E-G) Counts of the number of total three and four component positive and negative feedback and feedforward loops, in subnetworks where more steps are gradually allowed between the source and target nodes.
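The control described in this caption, in which link directions are randomly swapped except for links that touch the ligands or effectors, might be sketched as follows. This is one possible reading of that procedure, not the authors' code: each unprotected link is reversed with probability `p` (an assumed parameter), and the node names are hypothetical.

```python
import random


def swap_directions(links, protected, p=0.5, seed=0):
    """Reverse each directed link with probability `p`, unless one of its
    endpoints is a protected node (ligand or effector)."""
    rng = random.Random(seed)
    shuffled = []
    for src, dst in links:
        if src in protected or dst in protected:
            shuffled.append((src, dst))  # keep biologically specified link
        elif rng.random() < p:
            shuffled.append((dst, src))  # reverse the direction
        else:
            shuffled.append((src, dst))
    return shuffled


# Hypothetical toy network from a ligand to an effector.
LINKS = [
    ("LIGAND", "RECEPTOR"), ("RECEPTOR", "KINASE_1"),
    ("KINASE_1", "KINASE_2"), ("KINASE_2", "PHOSPHATASE"),
    ("PHOSPHATASE", "KINASE_1"), ("KINASE_2", "EFFECTOR"),
]
PROTECTED = {"LIGAND", "EFFECTOR"}
control = swap_directions(LINKS, PROTECTED, seed=3)
```

Because only directions change, every node keeps the same set of partners, so the overall connectivity structure is preserved while directed paths from ligand to effector are disrupted.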


Fig. S10 Changes in network size and statistical properties with gradual incorporation of nodes based on nodal connectivity. Sub-networks were created by including only nodes with the specified number of links per node (X-axis). A) The number of islands (isolated clusters of nodes that are not connected to one another), nodes, and links was counted in each subnetwork. B) Characteristic path-length (CPL), clustering coefficients (CC), and grid coefficients (GC) were calculated for each subnetwork.
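The quantities in this figure can be illustrated with a short Python sketch. Several choices here are assumptions rather than the authors' definitions: nodes are admitted when they have at most a given number of links in the full network, links are treated as undirected, a single unconnected node counts as an island, and the characteristic path-length is the mean shortest path over connected node pairs, computed with the Floyd-Warshall algorithm. The toy network is hypothetical.

```python
def connectivity_subnetwork(edges, max_links):
    """Keep nodes with at most max_links links, and the links among them."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    keep = {n for n, d in degree.items() if d <= max_links}
    return keep, [(u, v) for u, v in edges if u in keep and v in keep]

def count_islands(nodes, edges):
    """Number of connected components, ignoring link direction."""
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, islands = set(), 0
    for start in nodes:
        if start in seen:
            continue
        islands += 1
        seen.add(start)
        stack = [start]
        while stack:
            for m in neighbours[stack.pop()]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
    return islands

def characteristic_path_length(nodes, edges):
    """Mean shortest-path length over connected pairs (Floyd-Warshall)."""
    order = sorted(nodes)
    inf = float("inf")
    dist = {u: {v: (0 if u == v else inf) for v in order} for u in order}
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1
    for k in order:
        for i in order:
            for j in order:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    finite = [dist[i][j] for i in order for j in order
              if i != j and dist[i][j] < inf]
    return sum(finite) / len(finite) if finite else 0.0

# Hypothetical toy network: one hub (H) and three small clusters.
TOY = [("H", "a"), ("H", "b"), ("H", "c"), ("H", "d"),
       ("a", "b"), ("c", "d"), ("e", "f")]
rows = []
for k in (1, 2, 4):
    nodes, links = connectivity_subnetwork(TOY, k)
    rows.append((k, count_islands(nodes, links), len(nodes), len(links)))
full_nodes, full_links = connectivity_subnetwork(TOY, 4)
cpl = characteristic_path_length(full_nodes, full_links)
```

In the toy case, admitting the hub at the last threshold merges two islands into one, which is the kind of transition panel A tracks.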


[Fig. S11 panels. A: schematic of the motif with nodes input, Activated A, Activated B, Negative Regulator and output. B: two plots of Output (0.00 to 0.15) versus t (0 to 1500), one with curves for kcat A/kcat B = 1/1, 10/1, 10/0.1 and 10/0.01, the other with curves for Km A/Km B = 1/1, 10/1, 10/0.1 and 10/0.01.]

Fig. S11 Quantitative evaluation of a simple feedforward motif (motif 44 in Fig. S3). A) A feedforward motif (Motif 44, Fig. S3) was constructed and simulated using a system of coupled ODEs. In this model, a simulated square pulse of input activates enzymes A and B. After 300 seconds, the input was “washed out,” and the behavior of the output, which is activated by both A and B, was studied. To balance the activities of the activators, a negative regulator was included. The initial concentration of all components was 1 µM. B) To study the motif behavior, sensitivity analyses were performed on kcat and Km. The ratio between the kinetic parameters of the two arms was varied from 10 (kcatA = 10 s-1, kcatB = 1 s-1) to 1000 (kcatA = 10 s-1, kcatB = 0.01 s-1). A similar analysis was performed on Km. The activity of the output component is plotted as a function of time. The analysis shows that the motif behavior is robust: changing the kinetic parameters over 3 orders of magnitude had little effect on the qualitative behavior of the motif, as assessed by input-output relationships. These initial results suggest that components with widely varying reaction rates can come together to form functional units.
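The caption does not give the model equations, so the following Python sketch is only one plausible form of such a simulation, not the authors' model. Assumed here: mass-action activation of A and B by the input, Michaelis-Menten activation of the output by A and B, Michaelis-Menten inactivation by the negative regulator, forward-Euler integration, total concentrations of 1 µM with all active forms starting at zero, and hypothetical values for every rate constant except the kcat ratios named in the caption.

```python
def simulate_motif(kcat_a=10.0, kcat_b=1.0, km_a=1.0, km_b=1.0,
                   t_end=1500, pulse_end=300, steps_per_second=20):
    """Return the active-output trace (uM), one value per Euler step."""
    dt = 1.0 / steps_per_second
    k_on, k_off = 0.1, 0.05             # hypothetical A/B activation, decay (/s)
    kcat_n, km_n, neg = 1.0, 1.0, 1.0   # hypothetical negative regulator
    total = 1.0                          # uM, total of each component
    a = b = out = 0.0                    # active forms start at zero (assumed)
    trace = [out]
    for step in range(t_end * steps_per_second):
        # Square input pulse, washed out at pulse_end seconds.
        signal = 1.0 if step < pulse_end * steps_per_second else 0.0
        inactive = total - out
        d_a = k_on * signal * (total - a) - k_off * a
        d_b = k_on * signal * (total - b) - k_off * b
        d_out = (kcat_a * a * inactive / (km_a + inactive)
                 + kcat_b * b * inactive / (km_b + inactive)
                 - kcat_n * neg * out / (km_n + out))
        a += dt * d_a
        b += dt * d_b
        out += dt * d_out
        trace.append(out)
    return trace

# Sensitivity scan over the kcat ratios named in the caption.
SPS = 20
runs = {kb: simulate_motif(kcat_a=10.0, kcat_b=kb, steps_per_second=SPS)
        for kb in (1.0, 0.1, 0.01)}
at_washout = {kb: tr[300 * SPS] for kb, tr in runs.items()}
at_end = {kb: tr[-1] for kb, tr in runs.items()}
```

With these assumed parameters the output rises during the pulse and returns to baseline after washout for every ratio tested, which mirrors the qualitative robustness described in the caption; the absolute output levels are not expected to match the published curves.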


Table S2

Molec#1/Catalyst Molec#2/Substrate/Product Reaction type kf (/µM/s) kr (/s) KD/KM (µM) kcat (/s) Ref. Notes

Rap1GAPII Rap1 GAP 1.4 80.5 65 11 (1)

cAMP PKA site1 Binding 1.67 0.0167 0.010 (2)

cAMP PKA site2 Binding 0.0093 0.00028 0.030 (2)

βARK β -AR Phosphorylation 0.2 0.2 (3)

Csk Src Phosphorylation 1.8 40 22 0.4 (4) 2-fold adjustment for T; KD & kr are lower limits

calcium N-term calmodulin Binding 6 500 (5)

calcium C-term calmodulin High Aff-Binding 6 10 (5)

calcium C-term calmodulin Low Aff-Binding 6 100 (5) High- and low-affinity binding at the C-term reflects CaM complexing; dissociation but not association rate constants were modified by CaM's interactions

MKP3 ERK2 Dephosphorylation 0.022 0.2 (6)

CDK1-cyclin Clb2 unidentified substrate/s 35 (ATP) 2.2 (7) Used crude cell lysate as substrate, i.e. no single target, but perhaps more physiological

PIP2 PKB (PH dom) Binding 0.5 (8)

RasGAP Ras GAP 4.8 9.12 (9)

Caspase 9 Caspase3 Cleavage 0.089 0.0168 (10)

Caspase 3 PARP Cleavage 5.6 0.213 (10)

Sos Ras GEF 0.50505 0.02 (11)

PKA Representative substrate Phosphorylation 2.4 2.7 (12)

PKA Representative substrate Phosphorylation 7.5 9 (13)

CDK5 (p35 isoform) Tau Phosphorylation 33 0.0433 (14)

CDK5 (p25 isoform) Tau Phosphorylation 27 0.217 (14)

Calcineurin G-substrate, a substrate for PKG Dephosphorylation 3.8 0.41 (15)

Calcineurin DARPP-32, a substrate for PKA Dephosphorylation 1.6 0.2 (15)

Calcineurin synapsin I (site 1) Dephosphorylation 7 0.053 (15)

Calcineurin synapsin I (site 2) Dephosphorylation 4.4 0.04 (15)

Grb2 Sos Binding 0.025 0.0168 (16)

PKC (Ca bound) DAG Binding 0.008 8.6348 (16)

calmodulin MLCK (non phosphorylated) Binding 28 0.031 (17)

calmodulin MLCK (non phosphorylated) Binding 8 0.186 (17)

MEK MAPK Phosphorylation 0.476 0.113 (18, 19) Km = 0.476 µM for first phosphorylation, 0.046 µM for second phosphorylation

Raf MEK1 Phosphorylation 0.8 0.105 (20)

GSK3β β-CATENIN Phosphorylation 0.1276 3.5 (21, 22)

1.7 6.13 (23)

STEP MAPK Dephosphorylation 38 12 (24, 25) AC 1

AC cAMP Synthesis 1.4 7.9 (26, 27)

PDE4D cAMP Cleavage

NWASP Arp2/3 Binding 0.003 (28)

Grb2 NWASP Binding 0.005 (28)


References for Table S2:
1. Kraemer, A., Brinkmann, T., Plettner, I., Goody, R. & Wittinghofer, A. (2002) J Mol Biol 324, 763-74.
2. Zorn, M., Fladmark, K. E., Ogreid, D., Jastorff, B., Doskeland, S. O. & Dostmann, W. R. (1995) FEBS Lett 362, 291-4.
3. Kim, C. M., Dion, S. B. & Benovic, J. L. (1993) J Biol Chem 268, 15412-8.
4. Lieser, S. A., Shindler, C., Aubol, B. E., Lee, S., Sun, G. & Adams, J. A. (2005) J Biol Chem 280, 7769-76.
5. Gaertner, T. R., Putkey, J. A. & Waxham, M. N. (2004) J Biol Chem 279, 39374-82.
6. Zhou, B., Wang, Z. X., Zhao, Y., Brautigan, D. L. & Zhang, Z. Y. (2002) J Biol Chem 277, 31818-25.
7. Ubersax, J. A., Woodbury, E. L., Quang, P. N., Paraz, M., Blethrow, J. D., Shah, K., Shokat, K. M. & Morgan, D. O. (2003) Nature 425, 859-64.
8. Frech, M., Andjelkovic, M., Ingley, E., Reddy, K. K., Falck, J. R. & Hemmings, B. A. (1997) J Biol Chem 272, 8474-81.
9. Gideon, P., John, J., Frech, M., Lautwein, A., Clark, R., Scheffler, J. E. & Wittinghofer, A. (1992) Mol Cell Biol 12, 2050-6.
10. Bentele, M., Lavrik, I., Ulrich, M., Stosser, S., Heermann, D. W., Kalthoff, H., Krammer, P. H. & Eils, R. (2004) J Cell Biol 166, 839-51.
11. Orita, S., Kaibuchi, K., Kuroda, S., Shimizu, K., Nakanishi, H. & Takai, Y. (1993) J Biol Chem 268, 25542-6.
12. Hemmings, H. C., Jr., Nairn, A. C. & Greengard, P. (1984) J Biol Chem 259, 14491-7.
13. Bhalla, U. S. (2002) Biophys J 83, 740-52.
14. Hashiguchi, M., Saito, T., Hisanaga, S. & Hashiguchi, T. (2002) J Biol Chem 277, 44525-30.
15. King, M. M., Huang, C. Y., Chock, P. B., Nairn, A. C., Hemmings, H. C., Jr., Chan, K. F. & Greengard, P. (1984) J Biol Chem 259, 8080-3.
16. Bhalla, U. S., Ram, P. T. & Iyengar, R. (2002) Science 297, 1018-23.
17. Kasturi, R., Vasulka, C. & Johnson, J. D. (1993) J Biol Chem 268, 7958-64.
18. Seger, R., Ahn, N. G., Posada, J., Munar, E. S., Jensen, A. M., Cooper, J. A., Cobb, M. H. & Krebs, E. G. (1992) J Biol Chem 267, 14373-81.
19. Haystead, T. A., Dent, P., Wu, J., Haystead, C. M. & Sturgill, T. W. (1992) FEBS Lett 306, 17-22.
20. Force, T., Bonventre, J. V., Heidecker, G., Rapp, U., Avruch, J. & Kyriakis, J. M. (1994) Proc Natl Acad Sci U S A 91, 1270-4.
21. Ikeda, S., Kishida, S., Yamamoto, H., Murai, H., Koyama, S. & Kikuchi, A. (1998) Embo J 17, 1371-84.
22. Lee, E., Salic, A., Kruger, R., Heinrich, R. & Kirschner, M. W. (2003) PLoS Biol 1, E10.
23. Paul, S., Snyder, G. L., Yokakura, H., Picciotto, M. R., Nairn, A. C. & Lombroso, P. J. (2000) J Neurosci 20, 5630-8.
24. Tang, W. J., Stanzel, M. & Gilman, A. G. (1995) Biochemistry 34, 14563-72.
25. Taussig, R., Tang, W. J., Hepler, J. R. & Gilman, A. G. (1994) J Biol Chem 269, 6093-100.
26. Lim, J., Pahlke, G. & Conti, M. (1999) J Biol Chem 274, 19677-85.
27. Bolger, G. B., Erdogan, S., Jones, R. E., Loughney, K., Scotland, G., Hoffmann, R., Wilkinson, I., Farrell, C. & Houslay, M. D. (1997) Biochem J 328 ( Pt 2), 539-48.
28. Carlier, M. F., Nioche, P., Broutin-L'Hermite, I., Boujemaa, R., Le Clainche, C., Egile, C., Garbay, C., Ducruix, A., Sansonetti, P. & Pantaloni, D. (2000) J Biol Chem 275, 21946-52.

Table S2 Reaction rates for representative reactions underlying interactions in the network and a listing of the primary references from which these rates were obtained. Rates for several classes of the reactions underlying the links in the network were collected from published studies. The reactions for which rates are shown include: non-covalent binding including ligand-protein and protein-protein interactions, enzymatic reactions such as cAMP synthesis, phosphorylation, dephosphorylation, proteolytic cleavage, and changes in catalytic activity resulting from protein-protein interactions such as the GEF and GAP activity for GTPases. These types of reactions account for a large portion of the interactions within the cellular network.
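The rate columns of Table S2 are linked by standard kinetic relations: KD = kr/kf for a binding reaction and KM = (kr + kcat)/kf for a Michaelis-Menten enzyme. The short Python check below applies them to the rows that list every required value, on the assumption that the numbers in each flattened row follow the header order (kf, kr, KD/KM, kcat).

```python
def dissociation_constant(kf, kr):
    """KD (uM) from association kf (/uM/s) and dissociation kr (/s)."""
    return kr / kf

def michaelis_constant(kf, kr, kcat):
    """KM (uM) from kf (/uM/s), kr (/s) and kcat (/s)."""
    return (kr + kcat) / kf

# Values copied from Table S2; column assignment is assumed as stated above.
km_rap1gap = michaelis_constant(kf=1.4, kr=80.5, kcat=11)   # table lists 65
kd_pka_site1 = dissociation_constant(kf=1.67, kr=0.0167)    # table lists 0.010
kd_pka_site2 = dissociation_constant(kf=0.0093, kr=0.00028) # table lists 0.030
kd_csk_src = dissociation_constant(kf=1.8, kr=40)           # table lists 22
```

Under that column assignment the computed constants agree with the tabulated ones to within rounding.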


Table S3

Table S3 Supporting external files. The network database is provided in Text file S1 as a flat file in which the field separator is a blank space. Each line in the file describes an interaction extracted from a specific journal article. The tokens are described in the Methods section under “Storage of the network data”. The Pajek software (S2) can be used to visualize the network with Text file S2. Text files S3-S6 are the raw output files from the MFinder program. Code segments S7-S19 contain algorithms written in C/C++ used to generate, analyze and visualize the subnetworks in the study. Movies S20-S22 illustrate the concept we termed pseudodynamics, used to create the three different types of subnetworks. Spreadsheets S23-S28 contain the raw data from the analysis. These tables were used to create most of the graphs in the study. The supporting external files can also be found at http://amp.pharm.mssm.edu/network/sm/SM.htm and at http://www.mssm.edu/labs/iyengar. Other supporting online materials are available at: http://amp.pharm.mssm.edu/network/index.htm

File name           Description
Text file S1        Network data in flat file format.
Text file S2        Pajek format text file that describes the network data and assigns 2D locations to nodes on the map.
Text file S3        Raw output from the MFinder program searching for motifs of size 3.
Text file S4        Raw output from the MFinder program searching for motifs of size 4.
Text file S5        Raw output from the MFinder program searching for motifs of size 5.
Text file S6        Raw output from the MFinder program searching for motifs of size 6.
Code segment S7     Rules for sorting components into functional machines.
Code segment S8     Converts the network into a Pajek format text file.
Code segment S9     Creates sub-network where connectivity is propagated from ligands.
Code segment S10    Analysis for nodes, interactions, motifs, clustering and path-lengths of sub-networks starting at ligands.
Code segment S11    Implementation of the Floyd algorithm to compute the characteristic path-length.
Code segment S12    Implementation of the clustering coefficient algorithm.
Code segment S13    Implementation of the grid coefficient algorithm.
Code segment S14    Algorithm used to create 100 shuffled networks with the same connectivity distribution and one island.
Code segment S15    Code segment used to create sub-network from specified source node to specified target node.
Code segment S16    Code used to create sub-networks with gradual allowable nodal connectivity.
Code segment S17    Algorithm used to search for feedback loops.
Code segment S18    Algorithm used to search for feed-forward loops.
Code segment S19    Algorithm used to search for bi-fan motifs.
Movie S20           Illustration of the breadth-first concept.
Movie S21           Illustration of the sub-networks where the source and target are specified.
Movie S22           Illustration of the concept of gradual inclusion of nodes based on connectivity.
Spreadsheet S23     Connectivity propagation per step from all ligands.
Spreadsheet S24     Number of nodes, interactions, motifs, computed clustering and path-lengths for sub-networks starting at ligands.
Spreadsheet S25     Connectivity distribution table.
Spreadsheet S26     Characteristic path-lengths, clustering coefficients and grid coefficients for the real network and for 100 shuffled networks with the same connectivity distribution.
Spreadsheet S27     Results of the analysis of sub-networks from source to target.
Spreadsheet S28     Analysis of sub-networks based on nodal connectivity.
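For readers unfamiliar with the Pajek network format used by Text file S2, the Python sketch below writes a small directed network with 2D node positions in that layout: a *Vertices section listing index, quoted label and coordinates, followed by an *Arcs section listing source index, target index and weight. It illustrates the file format only and is not a port of Code segment S8; the node coordinates, the unit link weight and the function name are hypothetical.

```python
def to_pajek(nodes, arcs):
    """Return Pajek .net text for a directed network.

    nodes: dict mapping node name -> (x, y) position on the map
    arcs:  iterable of (source_name, target_name) directed links
    """
    # Pajek vertex indices start at 1.
    index = {name: i for i, name in enumerate(sorted(nodes), start=1)}
    lines = [f"*Vertices {len(index)}"]
    for name, i in index.items():
        x, y = nodes[name]
        lines.append(f'{i} "{name}" {x:.4f} {y:.4f}')
    lines.append("*Arcs")
    for source, target in arcs:
        lines.append(f"{index[source]} {index[target]} 1")  # weight 1 (assumed)
    return "\n".join(lines) + "\n"

# Hypothetical two-node example with made-up coordinates.
pajek_text = to_pajek({"Glutamate": (0.1, 0.2), "AMPAR": (0.5, 0.5)},
                      [("Glutamate", "AMPAR")])
```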


Supporting References:
S1. N. R. Gough, L. B. Ray, Science STKE. 135, EG8 (2002).
S2. http://vlado.fmf.uni-lj.si/pub/networks/pajek/ (2004).
S3. http://www.mathworks.com/products/matlab/ (2004).
S4. http://office.microsoft.com/excel/ (2004).
S5. T. H. Cormen, et al., Introduction to Algorithms (MIT Press, Cambridge, MA, 2002).
S6. D. J. Watts, S. H. Strogatz, Nature. 393, 440 (1998).
S7. G. Caldarelli, R. Pastor-Satorras, et al., European Physical Journal B. 38, 183 (2004).
S8. R. Milo, S. Shen-Orr, S. Itzkovitz, et al., Science. 298, 824 (2002).
S9. N. Kashtan, S. Itzkovitz, R. Milo, Bioinformatics. 20, 1746 (2004).
S10. http://office.microsoft.com/visio (2004).
S11. U. S. Bhalla, R. Iyengar, Science. 283, 381 (1999).
S12. W. Xiong, J. E. Ferrell Jr., Nature. 426, 460 (2003).
S13. A. A. Prinz, D. Bucher, E. Marder, Nature Neuroscience. 7, 1345 (2004).
S14. S. Mangan, U. Alon, Proc Natl Acad Sci U S A. 100, 11980 (2003).
S15. S. Mangan, A. Zaslaver, U. Alon, J Mol Biol. 334, 197 (2003).
S16. P. V. Nguyen, T. Abel, E. R. Kandel, Science. 265, 1104 (1994).
S17. S. S. Zakharenko, S. L. Patterson, I. Dragatsis, et al., Neuron. 39, 975 (2003).
S18. Y. Kovalchuk, E. Hanse, K. W. Kafitz, et al., Science. 295, 1729 (2002).

Author Contributions
AM assembled the large-scale network, wrote all of the custom code and conducted all of the analysis described in this study. SLJ developed the database programs to manage and maintain the components, links and references. AM, SN, AH, EG, BD-T, NJE, SLJ, RDB and RI found and read the original papers cited as references to verify the components and interactions. GW and PTR developed the original connections map of nearly 200 components on which the network used in this study was based. JJR, AK and GAS provided approaches to analyze and statistically validate several of the key findings. NJE performed the quantitative analysis of the feedforward motif. AH assembled the table of rate constants. RDB supervised the development of the interaction maps and their anchoring to the hippocampal neuron. RI was responsible for overall supervision of the project, including analysis of the data and writing of the manuscript, and is responsible for the final version of the submitted paper.
