Networks in Chemistry and Biology.pdf

7/30/2019 Networks in Chemistry and Biology.pdf


Networks in Chemistry and Biology

CCC 401

Drugs and Natural Remedies



Network Topology of Protein Binding Sites

Signatures of 108,089 binding sites from protein-ligand complexes in the Protein Data Bank were computed.

These signatures account for the distribution of polar and non-polar regions, as well as electrostatic potential, on the surface of the protein binding site.

Similar binding sites exhibit potential for binding similar ligands, which makes these signatures useful for drug repositioning.

Krein MP, Sukumar N. Exploration of the Topology of Chemical Spaces with Network Measures. J. Phys. Chem. A, 115(45), 12905-12918 (2011).



Biological Networks

Network representations are widely employed in systems biology, as the majority of gene products act together with other gene products in vivo to generate a complex network of interconnected components.

The nodes represent the variables of the system (genes, gene products, drugs, proteins, phenotypes, metabolites or even terms in the scientific literature) and the edges represent dependencies between these variables (either inter-molecular, i.e. protein-protein, protein-DNA or protein-ligand interactions, or co-occurrence of phenotypes or terms).



Yeast protein-protein interaction network

Each node is a protein found in yeast. Two nodes are connected by an edge if the two proteins interact.



The Human Disease Network and the Disease Gene Network

Human Disease Network: Each node corresponds to a distinct disease, coloured based on its disease class. The size of each node is proportional to the number of genes involved in the disease. The link thickness is proportional to the number of shared genes between the diseases.

Disease Gene Network: Here each node is a gene; two genes are connected if they are implicated in the same disease. The size of each node is proportional to the number of diseases in which the gene is implicated.

Goh K. et al. PNAS 2007;104:8685-8690. ©2007 by National Academy of Sciences



Network measures

Degree k: The most elementary characteristic of a node, which tells us how many links the node has to other nodes.

Degree distribution P(k): The probability that a selected node has exactly k links. P(k) is obtained by counting the number of nodes with k = 1, 2, ... links and dividing by the total number of nodes. The degree distribution allows us to distinguish between different classes of networks.

Clustering coefficient: $C_i = 2n_i / [k(k-1)]$, where k is the degree of node i and $n_i$ is the number of links connecting the k neighbors of node i to each other. Each such link closes a triangle through node i, so $C_i$ is the number of triangles that go through node i divided by the k(k-1)/2 triangles that could pass through it.

C(k) is the average clustering coefficient of all nodes with k links.
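The measures defined above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the adjacency-dictionary representation, the function names and the four-node toy graph are all assumptions made for the example, and nodes with fewer than two links are given a clustering coefficient of 0 by convention.

```python
from collections import Counter


def degree_distribution(adj):
    """P(k): fraction of nodes that have exactly k links."""
    n = len(adj)
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return {k: c / n for k, c in sorted(counts.items())}


def clustering_coefficient(adj, i):
    """C_i = 2 n_i / (k (k - 1)); taken as 0 when k < 2 (assumed convention)."""
    neighbors = list(adj[i])
    k = len(neighbors)
    if k < 2:
        return 0.0
    # n_i: number of links among the k neighbors of node i
    n_i = sum(
        1
        for a in range(k)
        for b in range(a + 1, k)
        if neighbors[b] in adj[neighbors[a]]
    )
    return 2 * n_i / (k * (k - 1))


def average_clustering_by_degree(adj):
    """C(k): mean clustering coefficient over all nodes with k links."""
    by_degree = {}
    for node, neighbors in adj.items():
        by_degree.setdefault(len(neighbors), []).append(
            clustering_coefficient(adj, node)
        )
    return {k: sum(v) / len(v) for k, v in sorted(by_degree.items())}


# Hypothetical toy graph: a triangle A-B-C with one extra node D attached to A.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

print(degree_distribution(toy))
print(clustering_coefficient(toy, "A"))
print(average_clustering_by_degree(toy))
```

In the toy graph, node A has three neighbors but only one link (B-C) among them, so one of its three possible triangles exists.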



Random and Scale-free networks

[Figure: panels labelled "Random" and "Scale-free".]

A.-L. Barabási, Linked: The New Science of Networks. Cambridge, MA: Plume Books, 2003.



Random networks

The node degrees follow a Poisson distribution, indicating that most nodes have approximately the same number of links (close to the average degree).

The tail of the distribution decays exponentially, P(k) ~ exp(-k), indicating that nodes that significantly deviate from the average are extremely rare.

The mean path length is proportional to the logarithm of the network size, indicating the small-world property.
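The Poisson behaviour described above can be checked numerically. The sketch below assumes the standard G(n, p) random-graph model, in which every pair of nodes is linked independently with probability p; the slides do not name a specific model, and the values of n, p and the seed are arbitrary choices for illustration.

```python
import math
import random


def random_network(n, p, seed=0):
    """Link every pair of nodes independently with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def poisson_pmf(k, mean):
    """Poisson probability of observing exactly k links."""
    return math.exp(-mean) * mean**k / math.factorial(k)


n, p = 2000, 0.003            # illustrative values; expected degree is p * (n - 1)
adj = random_network(n, p)
degrees = [len(neighbors) for neighbors in adj.values()]
mean_k = sum(degrees) / n

print(f"average degree: {mean_k:.2f}")
for k in range(0, 17, 2):
    observed = degrees.count(k) / n
    print(f"k={k:2d}  observed P(k)={observed:.4f}  Poisson={poisson_pmf(k, mean_k):.4f}")
print("largest degree:", max(degrees))
```

The observed fractions should track the Poisson values closely, and the largest degree should stay within a small multiple of the average, which is the sense in which strongly deviating nodes are rare.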



Scale-free networks

Characterized by a power-law degree distribution: the probability that a node has k links follows $P(k) \sim k^{-\gamma}$, where $\gamma$ is the degree exponent.

The probability that a node is highly connected is far higher than in a random graph, and the network's properties are often determined by a relatively small number of highly connected nodes (hubs).

Such distributions are seen as a straight line on a log-log plot.

Scale-free networks with degree exponents $2 < \gamma < 3$ (as in most biological and non-biological networks) are ultra-small, with the average path length following ~ log log N, significantly shorter than the log N that characterizes random small-world networks.
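To see why a power law appears as a straight line on a log-log plot, note that log P(k) = -γ log k + constant, so the slope of the line is -γ. The sketch below fits that slope by ordinary least squares on an exact, synthetic power law; the exponent 2.5 and the degree range are assumptions chosen for illustration, and for noisy empirical data a simple least-squares fit like this is only a rough estimate.

```python
import math


def loglog_slope(ks, pk):
    """Least-squares slope of log P(k) against log k."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(p) for p in pk]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


gamma = 2.5                                  # assumed exponent, between 2 and 3
ks = list(range(1, 101))
norm = sum(k ** -gamma for k in ks)
pk = [k ** -gamma / norm for k in ks]        # normalized synthetic P(k)

slope = loglog_slope(ks, pk)
print(f"fitted slope: {slope:.3f}")

# Growth of log N versus log log N with network size N
for size in (10**3, 10**6, 10**9):
    print(size, round(math.log(size), 2), round(math.log(math.log(size)), 2))
```

The last loop shows how slowly log log N grows compared with log N, which is what "ultra-small" refers to.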



Hierarchical networks

Clusters combine in an iterative manner, generating a hierarchical network and accounting for the coexistence of modularity, local clustering and scale-free topology.

The most important signature of hierarchical modularity is the scaling of the clustering coefficient, which follows $C(k) \sim k^{-1}$, a straight line of slope -1 on a log-log plot.

A hierarchical architecture implies that sparsely connected nodes are part of highly clustered areas, with communication between the different highly clustered neighborhoods being maintained by a few hubs.



Random, Scale-free and Hierarchical networks

[Figure: degree distribution and clustering coefficient for each of the three network types.]



Topological robustness

Disabling a substantial number of nodes in a random network results in functional disintegration: if a critical fraction of nodes is removed, a phase transition occurs, breaking the network into tiny, non-communicating islands of nodes.

Scale-free networks do not have a critical threshold for disintegration; they are robust against accidental failures: even if 80% of randomly selected nodes fail, the remaining 20% still form a compact cluster with a path connecting any two nodes.

This is because random failure affects mainly the numerous small-degree nodes, the absence of which doesn't disrupt the network's integrity.

But this reliance on hubs induces vulnerability to targeted attack: the removal of a few key hubs splinters the system into small isolated node clusters.

Complex systems, from the cell to the Internet, can be amazingly resilient against component failure, withstanding incapacitation of many individual components and many changes in external conditions.
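The contrast between random failure and targeted attack can be explored with a small simulation. This sketch rests on several assumptions that are not in the slides: the network is grown by preferential attachment (a common way to obtain a hub-dominated, scale-free-like topology), it is small (2000 nodes), and 20% of nodes are removed in each scenario, because a finite network of this size does not reproduce the 80% figure quoted for idealized scale-free networks. All function names are hypothetical.

```python
import random


def preferential_attachment(n, m, rng):
    """Grow a network where each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    adj = {i: set() for i in range(n)}
    stubs = []  # every node appears once per link it holds
    for i in range(m + 1):            # start from a small fully linked core
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            stubs += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs += [new, t]
    return adj


def largest_component(adj, removed):
    """Size of the largest connected cluster once `removed` nodes are deleted."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


rng = random.Random(1)
n = 2000
adj = preferential_attachment(n, 2, rng)
n_remove = n // 5                                        # remove 20% of nodes

failed = set(rng.sample(range(n), n_remove))             # accidental failure
hubs = set(sorted(adj, key=lambda i: len(adj[i]), reverse=True)[:n_remove])  # attack

lcc_failure = largest_component(adj, failed)
lcc_attack = largest_component(adj, hubs)
print("largest cluster after random failure:", lcc_failure)
print("largest cluster after targeted attack:", lcc_attack)
```

Removing the same number of nodes leaves a much smaller connected cluster when the highest-degree nodes are chosen than when nodes are chosen at random.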



Chemical Space Networks

What are the topological characteristics of a Chemical Space Network?

Study of the topological properties of chemical spaces is important for understanding the similarities between molecules and the domain of applicability of predictive QSAR models.



Network Similarity Graphs for 6 classes of enzyme inhibitors

Wawer, Peltason, Weskamp, Teckentrup and Bajorath, J. Med. Chem. 2008, 51, 6075-6084



Degree Distribution of a Chemical Space Network

ZINC: a database of over 15 million commercially-available compounds for virtual screening: http://zinc.docking.org/

Molecules with distance less than (i.e. similarity greater than) a threshold value are connected by an edge of the network graph.

Krein MP, Sukumar N. Exploration of the Topology of Chemical Spaces with Network Measures. J. Phys. Chem. A, 115(45), 12905-12918 (2011).
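The thresholding rule for building a chemical space network can be sketched as follows. The example assumes that each molecule is represented by the set of "on" bits of a binary fingerprint and that similarity is the Tanimoto coefficient; the four toy fingerprints, the molecule names and the 0.7 threshold are invented for illustration and do not come from ZINC or from the cited study.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_network(fingerprints, threshold):
    """Connect two molecules when their similarity exceeds the threshold."""
    adj = {name: set() for name in fingerprints}
    for u, v in combinations(fingerprints, 2):
        if tanimoto(fingerprints[u], fingerprints[v]) > threshold:
            adj[u].add(v)
            adj[v].add(u)
    return adj


# Hypothetical fingerprints (sets of bit positions), not real molecules.
toy_fingerprints = {
    "mol1": {1, 2, 3, 4},
    "mol2": {1, 2, 3, 5},
    "mol3": {1, 2, 3, 4, 5},
    "mol4": {7, 8, 9},
}

network = similarity_network(toy_fingerprints, threshold=0.7)
for name, neighbors in network.items():
    print(name, "degree", len(neighbors), sorted(neighbors))
```

The degree distribution of such a network depends strongly on the chosen threshold: lowering it adds edges (here mol1 and mol2, with similarity 0.6, would become connected below 0.6), while raising it isolates more molecules.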



Activity cliffs

According to Gerry Maggiora, deviations from the similarity principle might be due to the complex nature of the activity landscape associated with a biological assay.

This in turn is related to the chemical-space representation (molecular descriptor space) used to characterize the molecules and to the similarity assessment metric used.

Not all chemical spaces are created equal!

Thus very similar molecules may in some cases possess very different activities, giving rise to activity cliffs.



Structure-Activity Landscape Index


SALI quantifies the activity cliffs in chemical models of biological activity:

$\mathrm{SALI}_{i,j} = |A_i - A_j| / [1 - \mathrm{sim}(i,j)]$

$A_i$ and $A_j$ are the activities of molecules i and j, and sim(i,j) is the similarity coefficient between them.

Steep activity cliffs in a data set lead to high SALI values; these are the most interesting regions of a structure-activity relationship for drug design.
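A minimal sketch of the SALI calculation follows. The activity values and similarities are made-up numbers, and the handling of identical representations (similarity equal to 1, where the formula divides by zero) is an assumed convention for this example only.

```python
import math


def sali(activity_i, activity_j, similarity):
    """SALI_ij = |A_i - A_j| / (1 - sim(i, j))."""
    difference = abs(activity_i - activity_j)
    if similarity >= 1.0:
        # Assumed convention: the index diverges for identical representations
        # with different activities, and is taken as 0 when activities match.
        return math.inf if difference > 0 else 0.0
    return difference / (1.0 - similarity)


# Hypothetical pairs: (activity of i, activity of j, similarity)
pairs = {
    "similar pair, large activity gap": (8.0, 5.0, 0.75),
    "dissimilar pair, same gap": (8.0, 5.0, 0.5),
    "similar pair, small gap": (8.0, 7.5, 0.75),
}

for label, (a_i, a_j, sim) in pairs.items():
    print(f"{label}: SALI = {sali(a_i, a_j, sim):.1f}")
```

For the same activity gap, the more similar pair receives the higher value, which is how the index singles out activity cliffs.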



Global Network Topology of PubChem (SALI edges in red)

(a) Bioassay 361 network graph as determined by pairwise comparisons of PubChem fingerprints at an 85% Tanimoto similarity threshold, in a Fruchterman-Reingold layout. Thick red lines represent SALI edges, chosen at a 95% cutoff of non-zero values. (b) The network comprised solely of those SALI edges.

Krein MP, Sukumar N. Exploration of the Topology of Chemical Spaces with Network Measures. J. Phys. Chem. A, 115(45), 12905-12918 (2011).



Local Network Topology of PubChem (SALI edges in red)

Krein MP, Sukumar N. Exploration of the Topology of Chemical Spaces with Network Measures. J. Phys. Chem. A, 115(45), 12905-12918 (2011).
