

IEEE/ACM TRANSACTIONS ON NETWORKING 1

On the Discovery of Critical Links and Nodes for Assessing Network Vulnerability

Yilin Shen, Student Member, IEEE. Nam P. Nguyen, Ying Xuan, and My T. Thai, Member, IEEE.

Abstract—The assessment of network vulnerability is of great importance in the presence of unexpected disruptive events or adversarial attacks targeting critical network links and nodes. In this paper, we study the Critical Link Disruptor (CLD) and Critical Node Disruptor (CND) optimization problems, which identify critical links and nodes in a network whose removal maximally destroys the network’s functions. We provide a comprehensive complexity analysis of CLD and CND on general graphs and show that they remain NP-complete even on unit disk graphs and power-law graphs. Furthermore, the CND problem is shown to be NP-hard to approximate within $\Omega\left(\frac{n-k}{n^{\epsilon}}\right)$ on general graphs with n vertices and k critical nodes. Despite the intractability of these problems, we propose HILPR, a novel LP-based rounding algorithm, for efficiently solving CLD and CND problems in a timely manner. The effectiveness of our solutions is validated on various synthetic and real-world networks.

Index Terms—Network Vulnerability, Computational Complexity, Inapproximability, Experiments

I. INTRODUCTION

The assessment of network vulnerability studies how much the network performance degrades under various undesired disruptions, such as natural disasters, unexpected element failures, or especially adversarial attacks. From a typical attacker’s point of view, an attacker would first exploit the network weaknesses and then only needs to target some critical links or nodes whose corruption brings the whole network down to its knees. For instance, an adversarial attack on any essential Internet provider, e.g., tier-1 ISPs such as Qwest, AT&T or Sprint servers, once successful, may cause tremendous breakdowns to millions of companies’ websites and online services. In a natural disaster, an unexpected earthquake may destroy some important power lines and consequently lead to a large-area blackout. Therefore, in order to continuously maintain normal network functions, it is important to explore the network vulnerability, i.e., identify those crucial links and nodes, beforehand.

There have been many studies proposing different metrics to account for network vulnerability [6], [7], [24], [25], among which the degree of suspected nodes or edges [7], the average shortest path length [6], the global clustering coefficients [24], and the available number of compromised s−t flows [25] appear to be the most popular and effective. Unfortunately, these measures do not seem to work well for some particular kinds of network vulnerabilities, especially when network fragmentation is of high priority, as depicted in Figure 1.

Yilin Shen, Nam P. Nguyen, Ying Xuan, and My T. Thai are with the Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611 USA.

Let us consider the example in Figure 1, which illustrates a small portion of the Internet, where nodes v1, v2, ..., v7 are ISPs and the rest are consumers or transmission nodes. As revealed in this figure, successful attacks on nodes v8 and v10 are sufficient to bring the whole network down to its knees with no satisfied customers. In a different attacking strategy, the removal of node v7 or v14, if the adversary were to use maximum degree centrality as the metric, does not appear to harm the network function because all customers are still satisfied. These removals also reduce the global clustering coefficients to 0 and increase the average shortest path to nearly 3. Besides, if the attacker uses the available number of compromised v1−v2 flows, the destruction of nodes v4 and v7 will drop the flow to 1, which still, unfortunately, cannot destroy the giant ISP component providing services to (almost) the whole network.

This example illustrates an important point that the other metrics lack: in order to break down the network, we need to somehow control the balance among disconnected components while ensuring the nonexistence of giant components. One possible and effective way to do so is to measure the total pairwise connectivity, i.e., the number of connected node-pairs [13] in the network. Back to our example, a closer look at the destruction of nodes v8 and v10, which we already know can break down the network’s function, reveals that it indeed reduces the network’s total pairwise connectivity to the greatest extent (65%). This great reduction, as a result, significantly impairs the whole network.
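The metric can be made concrete with a few lines of code. The following is a minimal sketch, not the authors' implementation: it counts connected node-pairs by summing s(s − 1)/2 over the components of size s that remain after some nodes are deleted. The function name and the 6-node example graph are hypothetical and are not taken from Figure 1.

```python
from collections import defaultdict


def pairwise_connectivity(nodes, edges, removed_nodes=()):
    """Count connected node pairs after deleting removed_nodes (hypothetical helper)."""
    removed = set(removed_nodes)
    adj = defaultdict(set)
    for u, v in edges:
        if u not in removed and v not in removed:
            adj[u].add(v)
            adj[v].add(u)
    seen, total = set(), 0
    for s in nodes:
        if s in removed or s in seen:
            continue
        # Explore one connected component with an iterative depth-first search.
        stack, size = [s], 0
        seen.add(s)
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        # A component with `size` nodes contributes size-choose-2 connected pairs.
        total += size * (size - 1) // 2
    return total


# Hypothetical example: two triangles joined through the link (2, 3).
nodes = range(6)
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
full = pairwise_connectivity(nodes, edges)
after = pairwise_connectivity(nodes, edges, removed_nodes=[2])
print(full, after)
```

In this toy graph, deleting the single bridging node 2 leaves one connected pair on one side and three on the other, which is the kind of balanced fragmentation the metric rewards.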

Fig. 1. The removal of v8 and v10 (white nodes) is sufficient to destroy the function of the whole network such that only 35% of nodes connect to each other.

The total pairwise connectivity measure lends itself effectively to many practical network applications. For instance, in many large-scale networks, which have been shown to be scale-free [22], the removal of critical links and nodes regarding this metric can not only reduce the network performance but also possibly disconnect those networks from the outside world. Another application of this metric can be found in destroying terrorist networks, e.g., to break down the communication between any two terrorist individuals to the greatest extent, as well as in protecting the functionality of communication networks.

Motivated by the practical applicability of the total pairwise connectivity metric, we study two practical optimization problems, namely Critical Link Disruptor and Critical Node Disruptor (CLD and CND), to assess the network vulnerability when a given number of network elements (links or nodes) fail undesirably. We refer to these elements as critical links and nodes hereafter. Although the intractability of CND on general graphs has been proven in [9], other aspects of its theoretical hardness as well as the NP-hardness of CLD are still of interest. However, it is not trivial to derive them with respect to different network structures or topologies such as Unit Disk Graphs (UDGs) and Power-Law Graphs (PLGs), which have been identified as two of the most important models in wireless sensor networks and large-scale networks. In this paper, we provide the following hardness results: (1) the NP-hardness of CLD on general graphs; (2) the intractability of CLD and CND even on UDGs and PLGs; and (3) the inapproximability result $\Omega\left(\frac{n-k}{n^{\epsilon}}\right)$ for the CND problem on general graphs.

Due to their NP-hardness, it seems unrealistic to quickly obtain optimal solutions for CLD and CND problems within a time constraint. To this end, we propose HILPR, a heuristic algorithm to effectively solve these problems in a timely manner. In the big picture, HILPR formulates CLD and CND as Integer Linear Programming (ILP) problems and then solves them using a novel iterative rounding technique according to the LP solutions. In addition, HILPR also applies local search and constraint pruning techniques in order to further improve its efficiency and shorten its processing time.

In brief, our main contributions are:

– The NP-completeness of CLD on general graphs, and of CLD and CND even on UDGs and PLGs.

– The $\Omega\left(\frac{n-k}{n^{\epsilon}}\right)$ inapproximability of the CND problem on general graphs.

– The HILPR algorithm, an LP-based approach for effectively solving both CLD and CND problems with competitive results. The performance of our proposed algorithm is validated on a wide range of real-world and synthetic networks with different scales and topologies.

The paper is organized as follows. Related work is presented in Section II. In Section III, we introduce the network model and problem definitions. Hardness results are provided in Section IV. Section V further shows the inapproximability of CND on general graphs. We propose a novel LP-based rounding approach, HILPR, in Section VI and present its experimental evaluation in Section VII. Finally, we conclude the paper in Section VIII.

II. RELATED WORK

Many existing works on network vulnerability assessment mainly focus on centrality measurements [10], including degree, betweenness and closeness centralities, as well as the average shortest path length [6] and global clustering coefficients [24].

Because these measurements fail to capture such network vulnerability, Sun et al. [30] first proposed the total pairwise connectivity as an effective measurement and empirically evaluated the vulnerability of wireless multihop networks using this metric. Arulselvan et al. [9] showed the difficulty of the CND problem by proving its NP-completeness. Later on, the β-disruptor problem was defined by Dinh et al. [13] to find a minimum set of links or nodes whose removal degrades the total pairwise connectivity to a desired degree. They proved the NP-completeness of this problem with respect to both links and nodes, along with the corresponding inapproximability results. Even for the tree topology, Di Summa et al. [29] found that the discovery of critical nodes remains NP-complete using this metric. In this paper, we further investigate the theoretical hardness of both CLD and CND on UDGs and PLGs.

In addition, there are a few effective solutions in the literature on network vulnerability assessment based on the pairwise connectivity. Arulselvan et al. [9] designed a heuristic (CNLS) to detect critical nodes, which is, however, still far from the optimal solution in large-scale and dense networks. In [13], Dinh et al. proposed pseudo-approximation algorithms to solve the β-disruptor problem. However, this problem is defined differently from ours, and its solution is hard to use when we only know the available cost to destroy or protect these critical links or nodes.

III. NETWORK MODELS AND PROBLEM DEFINITION

We abstract the network into a weighted undirected graph G = (V, E, W) with a set of nodes V and a set of links E. The cost to destroy a link (i, j) or a node i is quantified as $w_{ij} \in W$ or $w_i \in W$, respectively, for all |E| = m links and |V| = n nodes. Two nodes i and j in the network are connected iff there exists at least one path from i to j.

In particular, when the communication network is an ad hoc homogeneous wireless network, the graph is a UDG [11], where each node corresponds to a circle with the same radius and a link appears between two nodes when the corresponding circles intersect. Moreover, since many large-scale networks, ranging from biological networks, the Internet, and the WWW to social networks, have been found to follow a power-law distribution in their degree sequences [22], we further model large-scale networks as PLGs, where the number of nodes having degree i is $\lfloor e^{\alpha}/i^{\beta} \rfloor$ for some constant β, and α is determined by the network size.
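Both graph models can be illustrated with a short sketch. The helper names below are hypothetical, and the code assumes closed disks, so that two equal-radius disks intersect exactly when their centers are at most twice the radius apart; it also lists the degree counts $\lfloor e^{\alpha}/i^{\beta} \rfloor$ for every degree whose count is positive.

```python
import math
from itertools import combinations


def unit_disk_edges(points, radius):
    """Link two nodes when their equal-radius disks intersect.

    Assumption: disks are closed, so centers at distance <= 2 * radius are linked.
    """
    return [(i, j)
            for (i, p), (j, q) in combinations(enumerate(points), 2)
            if math.dist(p, q) <= 2 * radius]


def power_law_degree_counts(alpha, beta):
    """Return {degree i: floor(e^alpha / i^beta)} for all degrees with a positive count."""
    counts, i = {}, 1
    while True:
        c = math.floor(math.exp(alpha) / i ** beta)
        if c == 0:
            break
        counts[i] = c
        i += 1
    return counts


# Hypothetical inputs chosen only for illustration.
udg = unit_disk_edges([(0, 0), (1, 0), (3, 0)], radius=0.5)
plg = power_law_degree_counts(alpha=2, beta=2)
print(udg, plg)
```

With α = 2 and β = 2, the sketch reports seven nodes of degree 1 and one node of degree 2, since e² ≈ 7.39.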

As described in the introduction, with respect to the total pairwise connectivity, we study the following Critical Link Disruptor (CLD) and Critical Node Disruptor (CND) optimization problems.

Definition 1 (Critical Link Disruptor): Given an integer k and a weighted undirected graph G = (V, E, W), the problem asks for a weight-bounded subset of critical links S ⊂ E, i.e., $\sum_{(i,j) \in S} w_{ij} \le k$, whose removal minimizes the total pairwise connectivity in G[E \ S].
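To make Definition 1 concrete, here is a minimal brute-force sketch that enumerates every link subset within the weight budget. It is exponential in |E| and is meant only for tiny instances; it is not the HILPR algorithm proposed in this paper, and all names in it are hypothetical.

```python
from itertools import combinations


def connected_pairs(nodes, edges):
    """Count connected node pairs using a small union-find structure."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    sizes = {}
    for v in nodes:
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())


def brute_force_cld(nodes, weighted_edges, k):
    """weighted_edges maps (u, v) -> w_uv; returns (best connectivity, best link set)."""
    links = list(weighted_edges)
    best = (connected_pairs(nodes, links), ())
    for r in range(1, len(links) + 1):
        for S in combinations(links, r):
            if sum(weighted_edges[e] for e in S) <= k:  # weight budget of Definition 1
                kept = [e for e in links if e not in S]
                best = min(best, (connected_pairs(nodes, kept), S))
    return best


# Hypothetical instance: the path 0-1-2-3 with unit link weights and budget k = 1.
nodes = [0, 1, 2, 3]
weighted_edges = {(0, 1): 1, (1, 2): 1, (2, 3): 1}
value, S = brute_force_cld(nodes, weighted_edges, k=1)
print(value, S)
```

On this path, cutting the middle link is optimal because it splits the nodes into two components of equal size.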

Definition 2 (Critical Node Disruptor): Given an integer k and a weighted undirected graph G = (V, E, W), the problem asks for a weight-bounded subset of critical nodes S ⊂ V, i.e., $\sum_{v_i \in S} w_i \le k$, whose removal minimizes the total pairwise connectivity in G[V \ S].
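Definition 2 can be illustrated in the same spirit. The sketch below exhaustively tries every node subset within the weight budget, so it only scales to toy graphs; it is an illustration under stated assumptions (hypothetical function names, a hypothetical 5-node path), not the method proposed in this paper.

```python
from itertools import combinations


def connected_pairs(nodes, edges):
    """Count connected node pairs in the subgraph induced by `nodes`."""
    nodes = set(nodes)
    adj = {v: [] for v in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].append(v)
            adj[v].append(u)
    seen, total = set(), 0
    for s in nodes:
        if s in seen:
            continue
        seen.add(s)
        stack, size = [s], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        total += size * (size - 1) // 2
    return total


def brute_force_cnd(node_weights, edges, k):
    """node_weights maps v -> w_v; returns (best connectivity, best node set)."""
    nodes = list(node_weights)
    best = (connected_pairs(nodes, edges), ())
    for r in range(1, len(nodes) + 1):
        for S in combinations(nodes, r):
            if sum(node_weights[v] for v in S) <= k:  # weight budget of Definition 2
                rest = [v for v in nodes if v not in S]
                best = min(best, (connected_pairs(rest, edges), S))
    return best


# Hypothetical instance: the path 0-1-2-3-4 with unit node weights and budget k = 1.
weights = {v: 1 for v in range(5)}
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
value, S = brute_force_cnd(weights, path, k=1)
print(value, S)
```

All interior nodes of the path have the same degree, yet only the middle one minimizes the pairwise connectivity, which echoes the point made in the introduction about degree-based metrics.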


IV. COMPUTATIONAL HARDNESS

In this section, we prove the NP-completeness of CLD on general graphs and further show the NP-completeness of CLD and CND on unit disk graphs and power-law graphs to strengthen their intractability.

A. General Graphs

The NP-hardness proof for CLD on general graphs uses a reduction from the p-Multiway Cut problem, which is defined as follows.

Definition 3 (p-Multiway Cut): Given an undirected graph G = (V, E) and a subset of terminals T ⊆ V with |T| = p, find a minimum set of links of G whose removal disconnects all nodes in T. That is, each node in T belongs to a different connected component.

Theorem 1: The CLD problem is NP-complete even if all links have unit weights.

Proof: Consider the decision version of CLD, which asks whether an undirected graph G = (V, E) contains a set of links S ⊂ E of size k such that the pairwise connectivity in the residual graph G[E \ S] is at most c for a given positive integer c. It is easy to see that CLD ∈ NP, since a nondeterministic algorithm only needs to guess a subset S of links and check in polynomial time whether S has the appropriate size k and whether the pairwise connectivity of G[E \ S] is at most c by using a depth-first search tree.

To prove that CLD is NP-hard, we reduce the 3-multiway cut (3-MC) problem to it, since p-MC has been proven to be NP-hard for any p ≥ 3 by Dahlhaus et al. [12]. Let an undirected graph G = (V, E) with |V| = n, a set T ⊆ V of three terminals, and a positive integer k ≤ |E| be an arbitrary instance of 3-MC. We construct the graph G′ = (V′, E′) as follows. For each terminal v ∈ T, we attach $l = n^2$ more nodes to v to construct a clique of size $n^2 + 1$, as illustrated in Fig. 2. We show that there is a 3-MC of size k in G iff G′ has a CLD S′ of size k such that the pairwise connectivity of G′[E′ \ S′] is at most $2\binom{n^2+1}{2} + \binom{n^2+n-2}{2}$.
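The construction of G′ is mechanical and can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the tuple labels given to the new clique nodes are hypothetical, and the 4-cycle instance is made up for the example.

```python
from itertools import combinations


def build_cld_instance(nodes, edges, terminals):
    """Attach n^2 new nodes to each terminal so that they form a clique of size n^2 + 1."""
    n = len(nodes)
    new_nodes, new_edges = list(nodes), list(edges)
    for t in terminals:
        # ("c", t, i) is a hypothetical label for the i-th node added to terminal t.
        clique = [t] + [("c", t, i) for i in range(n * n)]
        new_nodes.extend(clique[1:])
        new_edges.extend(combinations(clique, 2))
    return new_nodes, new_edges


# Hypothetical 3-MC instance: a 4-cycle with terminals 1, 2 and 3.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
g_nodes, g_edges = build_cld_instance(nodes, edges, terminals=[1, 2, 3])
print(len(g_nodes), len(g_edges))
```

For n = 4 this adds 3 · 16 nodes and 3 · C(17, 2) links, so the reduced instance stays polynomial in the size of G.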

Fig. 2. An example of CLD reduction: (a) a graph instance G; (b) the reduced graph G′ from 3-MC. Gray nodes are the subset of nodes to be disconnected, and grid components are cliques $K_l$ ($l = n^2$), each of which connects to a gray node.

First, suppose S ⊆ E is a 3-MC for G with |S| = k. We need to show that S is also a CLD of G′. According to the construction of G′, there is one node in each component of G connecting to a clique of size $n^2$. It is easy to see that the pairwise connectivity is maximized when two connected components of G have size 1 and the other one has size n − 2. That is, the total pairwise connectivity in G′ is upper bounded by $2\binom{n^2+1}{2} + \binom{n^2+n-2}{2}$. Thus, the total pairwise connectivity of G′[E′ \ S] is upper bounded by $2\binom{n^2+1}{2} + \binom{n^2+n-2}{2}$, implying that S is a CLD of G′.

Conversely, suppose that S′ ⊆ E′ with |S′| = k ≤ |E| is a CLD of G′, that is, the total pairwise connectivity of G′[E′ \ S′] is at most $2\binom{n^2+1}{2} + \binom{n^2+n-2}{2}$. We show that S′ is indeed also a 3-MC of G. Note that S′ ∩ (E′ \ E) = ∅, that is, no critical link e ∈ S′ can be a link in the constructed cliques, since removing k links cannot disconnect the cliques; otherwise the total pairwise connectivity is at least $\binom{2n^2+2}{2} + \binom{n^2+n-2}{2}$, which is larger than $2\binom{n^2+1}{2} + \binom{n^2+n-2}{2}$. Therefore, S′ ⊆ E. Now, we further claim that S′ also disconnects all three terminals. Assume that it does not; then the cliques of two terminals are connected. Thus, the pairwise connectivity is at least $\binom{2n^2+2}{2} + \binom{n^2+n-2}{2} > 2\binom{n^2+1}{2} + \binom{n^2+n-2}{2}$, a contradiction.
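The strict inequality used in the last step can be checked numerically. The sketch below is only a sanity check, not part of the proof; the function names are hypothetical. Writing $m = n^2 + 1$, the gap between the two sides is $\binom{2m}{2} - 2\binom{m}{2} = m^2$, which is what the final line compares against.

```python
from math import comb


def cld_threshold(n):
    """Bound c used in the reduction: 2*C(n^2+1, 2) + C(n^2+n-2, 2)."""
    return 2 * comb(n * n + 1, 2) + comb(n * n + n - 2, 2)


def merged_lower_bound(n):
    """Connectivity when two terminal cliques stay connected: C(2n^2+2, 2) + C(n^2+n-2, 2)."""
    return comb(2 * n * n + 2, 2) + comb(n * n + n - 2, 2)


# The gap is positive for every n tested and matches the closed form (n^2 + 1)^2.
gaps = {n: merged_lower_bound(n) - cld_threshold(n) for n in range(3, 50)}
print(all(gap == (n * n + 1) ** 2 for n, gap in gaps.items()))
```

Because the gap grows like $n^4$, merging two terminal cliques always pushes the connectivity strictly above the threshold.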

B. Unit Disk Graphs

The proofs for CLD and CND on UDGs use reductions from the Planar Independent Set (PIS) problem with maximum degree 3 and the Planar Vertex Cover (PVC) problem with maximum degree 3, respectively. Given G = (V, E), the independent set (IS) problem asks for a subset S ⊆ V of maximum size such that no two vertices in S are adjacent, whereas the vertex cover (VC) problem asks for a subset S ⊆ V of minimum size such that every edge in G has at least one endpoint in S. First, we present a fundamental lemma that will be used in both proofs.
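The two definitions translate directly into checks on a candidate set. The following minimal sketch uses hypothetical function names and a hypothetical 4-cycle; it also illustrates the standard fact that S is an independent set exactly when V \ S is a vertex cover.

```python
def is_independent_set(edges, S):
    """True if no edge has both endpoints in S."""
    S = set(S)
    return not any(u in S and v in S for u, v in edges)


def is_vertex_cover(edges, S):
    """True if every edge has at least one endpoint in S."""
    S = set(S)
    return all(u in S or v in S for u, v in edges)


# Hypothetical example: the 4-cycle 1-2-3-4-1.
V = {1, 2, 3, 4}
E = [(1, 2), (2, 3), (3, 4), (4, 1)]
S = {1, 3}
print(is_independent_set(E, S), is_vertex_cover(E, V - S))
```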

Lemma 1 (Valiant [31]): A planar graph G with maximum degree 4 can be embedded in the plane using O(|V|) area in such a way that its nodes are at integer coordinates and its links are drawn so that they are made up of line segments of the form x = i or y = j, for integers i and j.

Theorem 2: The CLD problem is NP-complete on UDGs even if all nodes have unit weights.

Proof: Consider the decision version of CLD as defined in the proof of Theorem 1. To prove that CLD is NP-hard on UDGs, we reduce the planar independent set (PIS) problem with maximum degree 3 to it. Let a planar graph G = (V, E) with |V| = n and maximum degree 3 and a positive integer k ≤ n be an arbitrary instance of PIS. We must construct in polynomial time a UDG G′ = (V′, E′) and positive integers k′ and c such that G has an independent set of size k iff G′ has a CLD of size k′ such that the pairwise connectivity of G′ after removing this CLD is at most c.

Fig. 3. An example of CLD reduction: (a) an instance G; (b) the reduced graph G′ from PIS.

Our construction of G′ is as follows. Given a planar graph G, we first embed it into the plane according to Lemma 1. For each node v ∈ V, we convert it to a triangle $T_v = \{v_1, v_2, v_3\}$ (gray triangles in Fig. 3). Each link in $T_v$ is set to the unit length $r = \frac{\sqrt{3}}{3}$. Since each node in G has maximum degree 3, we attach the neighbors of v to $v_1$, $v_2$ and $v_3$. If the degree is less than 3, we only need to arbitrarily choose one or two of them to attach to. Afterwards, for each embedded link e, we divide it into |e| “pieces” according to its length |e| and replace each piece by a concatenation of 2 triangles, as shown in Fig. 3. Thus, G′ is composed of $2\sum_{u,v \in V} |e_{uv}|$ $K_3$ cliques, where $|e_{uv}|$ is the length of the link between nodes u and v after embedding onto the plane. Note that G′ is a UDG with radius $r = \frac{\sqrt{3}}{3}$. Finally, we set $k' = 2\sum_{u,v \in V} |e_{uv}| + 3(n-k)$ and $c = 3\left(\sum_{u,v \in V} |e_{uv}| + k\right)$.

First, suppose S ⊆ V is an IS of G with |S| = k. For each piece in all links of G, we remove two links from the 2 triangles such that each piece has one triangle left. After doing this, all triangles $T_v$ corresponding to nodes v ∈ S will be disconnected from other nodes in G′. For each node v ∉ S, we remove all links in $T_v$. Then the pairwise connectivity in G′ is c after removing these k′ critical links.

Conversely, suppose that S′ ⊂ E′ with |S′| = k′ is a CLD of G′, that is, the pairwise connectivity of G′[E′ \ S′] is at most c. It is easy to see that the pairwise connectivity will increase by $\sum_{u,v \in V} |e_{uv}|/2$ if the two triangles corresponding to every other piece are connected, since there is one more connected node-pair. Thus, one of the two triangles for each piece in G′ will be disconnected, so that no two triangles are connected. That is, two links will be removed from each piece. Apart from this, in order to reduce the pairwise connectivity to c by removing 3(n − k) more links, we have to avoid connectivity between the nodes in different triangles. For each triangle $T_v$, we remove either all three links or none of them, since the removal of links in a connected component does not help to reduce the pairwise connectivity if it cannot disconnect the component. In this process, the exchange between some links belonging to $K_2$ and the nodes in other triangles can be finished within polynomial time. This implies that the triangles $T_v$ with all their links remaining form an independent set in the original graph G, and the number of such $T_v$ is k.

Before showing the NP-hardness of CND on UDGs, we first prove the following useful lemma.

Lemma 2: For any planar graph G = (V, E) with maximum degree 3, G can be embedded into a UDG with radius 1/2 such that there exists an independent-space Γ(v) for each node v ∈ V satisfying Γ(v) ≠ ∅, and any new node in Γ(v) will be incident to v and independent from all other nodes. (See Appendix A for the proof.)

Theorem 3: The CND problem is NP-complete on UDGs even if all nodes have unit weights.

Proof: Consider the decision version of CND, which asks whether a UDG G = (V, E) contains a set of nodes S ⊂ V of size k such that the pairwise connectivity in G[V \ S] is at most c for a given positive integer c. Clearly, CND ∈ NP on UDGs.

Now, to prove that CND on UDGs is NP-hard, we reduce the planar vertex cover (PVC) problem with maximum degree 3 to it. Let a planar graph G = (V, E) with maximum degree 3 and |V| = n and a positive integer k ≤ n be an arbitrary instance of PVC. We must construct in polynomial time a UDG G′ = (V′, E′) and positive integers k′ and c such that G has a vertex cover of size at most k iff G′ has a CND of size k′ such that the pairwise connectivity of G′ after removing this CND is at most c.

Fig. 4. An example of CND reduction on UDGs: (a) unit disk graph $G_u$; (b) reduced graph G′ from PVC.

Our construction is as follows. Consider G as a clique of size 3 with nodes $v_1$, $v_2$ and $v_3$. First, we construct $G_u$ as in Fig. 4(a) according to Lemma 2. On $G_u$, for each independent-space $\Gamma(v_i)$, we add a new gray node $u_i$ such that there is a link only between $v_i$ and $u_i$, obtaining G′ as shown in Fig. 4(b). Finally, set $k' = k + \sum_{e \in E(G)} p_e$ and $c = n - k$.

First, suppose S ⊆ V is a VC of G with |S| = k. It is easy to verify that G has a VC S of size at most k iff $G_u$ has a VC $S_u$ of size at most $k + \sum_{e \in E(G)} p_e$. That is, $S_u$ consists of S and half of the nodes $w_i$ on each path $(v_i, w_1, \ldots, w_{2p_{v_i,v_j}}, v_j)$. We show that $S_u$ is also a CND of G′. After removing $S_u$ from G′, only the disjoint links $(v_i, u_i)$ with $v_i \notin S$ are left. Thus, the pairwise connectivity is n − k, which is equal to c.

Conversely, suppose that S′ ⊂ V′ with |S′| = k′ is a CND of G′, that is, the pairwise connectivity of G′[V′ \ S′] is at most c. If $u_i \in S'$, that is, S′ contains a gray node, replacing $u_i$ with any $v_i$ or $w_i$ (black or white nodes) will further decrease the pairwise connectivity. Thus, we can assume that S′ does not contain any gray node. Since $|S'| = k + \sum_{e \in E(G)} p_e$, we can easily modify S′ to be a vertex cover of $G_u$. In this case, the pairwise connectivity is at most c = n − k. Thus, S′ ∩ V is a VC of G.

C. Power-Law Graphs

As power-law networks have been shown to be vulnerable to attacks [8], [28], one may ask whether power-law networks can be fragmented by simply attacking the hub nodes. Additionally, it is believed that many optimization problems that are intractable on general graphs may be solvable on PLGs [14], [16], [27]. Unfortunately, we further prove that both CLD and CND remain NP-complete on PLGs. We begin with the following fundamental lemmas.

Lemma 3 (Ferrante et al. [15]): Let $G_1 = (V_1, E_1)$ be a simple graph with n nodes and β ≥ 1. For $\alpha \ge \max\{4\beta, \beta \log n + \log(n+1)\}$, we can construct a power-law graph $G = G_1 \cup G_2$ with exponential factor β and number of nodes $e^{\alpha}\zeta(\beta)$ by constructing a bipartite graph $G_2$ as a maximal component in G.


Lemma 4: The clique separator (CS) problem (defined as follows: given an undirected graph G = (V, E), find a minimum set of links S ⊆ E such that the connected components of G[E \ S] are cliques, each of size at least 3) is NP-hard. (See Appendix B for the proof.)

Theorem 4: The CLD problem is NP-complete on power-law graphs even if all nodes have unit weights.

Proof: Consider the decision version of CLD defined before. To prove that CLD on power-law graphs is NP-hard, we reduce the clique separator (CS) problem to it. After constructing a power-law graph $G' = G \cup G_b$, where the bipartite graph $G_b = (U_b, V_b; E_b)$ is a maximal component in G′ according to Lemma 3, we show that there is a CS of size k in G iff G′ has a CLD S′ of size k′ such that the pairwise connectivity of G′[E′ \ S′] is at most c, where $k' = k + |E_b| - |M_b|$ and $c = |E| - k + |M_b|$. Note that $M_b$ is the set of links in a maximum matching of $G_b$.

First, suppose S ⊆ E is a clique separator of G with |S| = k. The pairwise connectivity of G[E \ S] is |E| − k, since all components in this graph are cliques. Since a maximum matching of $G_b$ can be found in polynomial time using the Hopcroft-Karp algorithm [20], the pairwise connectivity of G′ is c after removing an additional $|E_b| - |M_b|$ links.

Conversely, suppose that S′ ⊂ E′ is a CLD of G′ with size k′. Note that $S' = A \cup S_b$, where A and $S_b$ are CLDs on G and $G_b$, respectively. We show that the number of critical links $|S_b|$ in $G_b$ is $|E_b| - |M_b|$. If $|S_b| < |E_b| - |M_b|$, the pairwise connectivity of $G_b$ increases by at least two when adding one more link onto the maximum matching. On the other hand, the removal of l links on G can reduce the pairwise connectivity by at most l after removing the CS of size k. If $|S_b| > |E_b| - |M_b|$, the pairwise connectivity of $G_b$ reduces by one when removing one more link from the maximum matching. Meanwhile, a link added onto the residual graph of G will increase the pairwise connectivity by at least one if it connects two independent nodes and by at least 3 if it has one endpoint belonging to some component in the residual graph of G. Thus, we have $|S_b| = |E_b| - |M_b|$, and it is easy to verify that A is a CS of G.
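The quantity $|E_b| - |M_b|$ used in this proof depends on a maximum matching of the bipartite component. The sketch below computes one with the simple augmenting-path method (often called Kuhn's algorithm), which is swapped in here for brevity in place of the Hopcroft-Karp algorithm cited above; it returns a matching of the same size but with a weaker running-time bound. The function name and the small bipartite graph are hypothetical.

```python
def maximum_bipartite_matching(left, adj):
    """adj maps each left node to its right neighbors; returns {right node: left node}."""
    match = {}

    def try_augment(u, visited):
        # Look for an augmenting path starting at left node u.
        for v in adj.get(u, ()):
            if v in visited:
                continue
            visited.add(v)
            if v not in match or try_augment(match[v], visited):
                match[v] = u
                return True
        return False

    for u in left:
        try_augment(u, set())
    return match


# Hypothetical bipartite component G_b with 3 left nodes, 2 right nodes and 4 links.
left = ["a", "b", "c"]
adj = {"a": ["x", "y"], "b": ["x"], "c": ["y"]}
M = maximum_bipartite_matching(left, adj)
num_links = sum(len(vs) for vs in adj.values())
removed = num_links - len(M)  # |E_b| - |M_b| links removed in the reduction
print(len(M), removed)
```

After those links are removed, only the matched pairs stay connected, so this component contributes exactly $|M_b|$ to the pairwise connectivity, as in the bound c above.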

Theorem 5: The CND problem is NP-complete on power-law graphs even if all nodes have unit weights.

Proof: Consider the decision version of CND as defined before. To prove that CND on power-law graphs is NP-hard, we reduce the vertex cover (VC) problem to it. Let an undirected graph G = (V, E) with |V| = n and a positive integer k ≤ n be any instance of VC. We construct a power-law graph G′ = (V′, E′) as follows. First, for each node $v_i \in V$ of graph G, we attach one additional node $u_i$ to it; we call the resulting graph $G_1$, with $V_1 = V \cup U$ where $U = \{u_i\}$. Then, according to Lemma 3, a power-law graph G′ = (V′, E′) can be constructed by embedding $G_1$ and a bipartite graph $G_2 = (V_2^1, V_2^2; E_2)$, where $V_2^1$ and $V_2^2$ are two sets of disjoint nodes in $G_2$ and $\alpha \ge \max\{4\beta, \beta \log(2n) + \log(2n+1)\}$ for some specific β. Note that $V_2^1$ and $V_2^2$ are marked gray and white, respectively, as shown in Fig. 5. We show that there is a VC of size k in G iff G′ has a CND S′ of size k′ such that the pairwise connectivity of G′[V′ \ S′] is at most c, where $k' = k + \min\{|V_2^1|, |V_2^2|\}$ and $c = n - k$.

Fig. 5. An example of CND reduction on PLGs. For simplicity, we just draw the nodes in G and its newly added links and nodes. Panels: (a) An instance; (b) Reduced graph G′ = G1 ∪ G2.

First, suppose $S \subseteq V$ is a vertex cover of $G$ with $|S| = k$. Then $G$ has a vertex cover $S$ of size $k$ iff $G \cup G_2$ has a vertex cover $S'$ of size $k + \min\{|V_2^1|, |V_2^2|\}$, since VC is polynomially solvable in any bipartite graph according to König's Theorem [21]. After removing $S'$ from $G'$, only the disjoint links $(v_i, u_i)$ with $v_i \notin S$ are left. Therefore, the pairwise connectivity of the power-law graph $G'$ is $n - k$, which is equal to $c$.

Conversely, suppose that $S' \subset V'$ with $|S'| = k'$ is a CND of $G'$, that is, the total pairwise connectivity of $G'[V' \setminus S']$ is at most $c$. First, if $u_i \in S'$, it is easy to verify that replacing $u_i$ with any $v_i$ will further decrease the pairwise connectivity. Since $|S'| = k' = k + \min\{|V_2^1|, |V_2^2|\}$, we can easily modify $S'$ to be a vertex cover of $G \cup G_2$, where the total pairwise connectivity on $G'$ is at most $c = n - k$. Thus, $S' \cap V$ is a VC of $G$.
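The forward direction of this proof can be checked on a toy instance. The sketch below is only an illustration under simplifying assumptions: it builds G1 (each node of G with one attached node) and omits the bipartite part G2 and the power-law embedding of Lemma 3. The instance (a path on five nodes) and all function names are hypothetical. It confirms that removing a vertex cover of size k leaves n − k disjoint links, while removing a set that is not a cover leaves a larger pairwise connectivity.

```python
def pairwise_connectivity(nodes, edges):
    """Count unordered node pairs joined by a path, using union-find."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())


def attach_pendants(nodes, edges):
    """Build G1: attach one new node ('u', v) to every original node v."""
    pendants = [("u", v) for v in nodes]
    return list(nodes) + pendants, list(edges) + [(v, ("u", v)) for v in nodes]


def residual_connectivity(nodes, edges, removed):
    """Pairwise connectivity after deleting the node set `removed`."""
    kept_nodes = [v for v in nodes if v not in removed]
    kept_edges = [(a, b) for a, b in edges
                  if a not in removed and b not in removed]
    return pairwise_connectivity(kept_nodes, kept_edges)


# Hypothetical VC instance: the path 0-1-2-3-4, so n = 5.
V = [0, 1, 2, 3, 4]
E = [(0, 1), (1, 2), (2, 3), (3, 4)]
V1, E1 = attach_pendants(V, E)

with_cover = residual_connectivity(V1, E1, {1, 3})     # vertex cover, k = 2
without_cover = residual_connectivity(V1, E1, {0, 1})  # not a vertex cover
print(with_cover, without_cover)
```

With the cover {1, 3} the residual graph consists of the three links (v_i, u_i) for the uncovered nodes, so the value equals n − k = 3.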

V. INAPPROXIMABILITY OF THE CND PROBLEM

Since the CND problem is NP-complete, a natural question is how closely its optimal solution can be approximated, which leads to the theory of inapproximability. The inapproximability factor is a lower bound on the approximation ratio achievable with a theoretical performance guarantee; that is, no one can design an approximation algorithm with a better ratio than the inapproximability factor. In this section, we further investigate the inapproximability of CND on general graphs by showing a gap-preserving reduction from the maximum clique problem, which is defined as follows.

Definition 4 (Maximum Clique): Given an undirected graph G = (V, E), find a clique of maximum size. A subgraph of G is called a clique if all its nodes are pairwise adjacent.

Theorem 6: Assuming $P \ne NP$, the CND problem is NP-hard to be approximated within $\Omega\left(\frac{n-k}{n^{\epsilon}}\right)$ for any $\epsilon < 1 - \log_n 2$ on general graphs even if all nodes have unit weights.

Proof: In the proof, we show a gap-preserving reduction from the MC problem to CND. According to Håstad [18], Maximum Clique (MC) is NP-hard to approximate within $n^{1-\epsilon}$. Let an undirected graph $G = (V, E)$ and a positive integer $h \le n$ be an MC instance. We construct the graph $G' = (V', E')$ as follows. We first construct the complement graph $\bar{G} = (V, \bar{E})$ where $\bar{E} = \{(v_i, v_j) \mid (v_i, v_j) \notin E\}$. For each node $v$ in $\bar{G}$, we add $l \le n^{1-\epsilon} - 1$ more nodes to construct a clique of size $l + 1$ to obtain $G'$, as shown in Fig. 6. Let $\varphi$ be a feasible solution of MC in $G$ and $\phi$ be a feasible solution of CND in $G'$ where $|\phi| = n - h$. Defining

$$c = (n-h)\binom{l}{2} + h\binom{l+1}{2},$$

we show the completeness and soundness respectively.

Fig. 6. An example of gap-preserving reduction from MC to CND. Panels: (a) A graph instance on nodes 1 to 5; (b) Reduced graph G′ from MC, in which each node is extended to a clique K_l through the edge set E_l.
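The construction of G′ and the value c can be sanity-checked on a small instance. The following sketch is an illustration only: the five-node graph, the choice l = 2, and the function names are assumptions made here, and the bound l ≤ n^{1−ε} − 1 is not enforced. It builds the complement graph, extends every node to a clique of size l + 1, removes the n − h nodes outside a maximum clique of G, and compares the residual pairwise connectivity with c.

```python
from itertools import combinations
from math import comb


def pairwise_connectivity(nodes, edges):
    """Count unordered node pairs joined by a path, using union-find."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())


def build_reduction(nodes, edges, l):
    """Complement graph of G, with every node extended to a clique of size l + 1."""
    present = {frozenset(e) for e in edges}
    new_nodes = list(nodes)
    new_edges = [(a, b) for a, b in combinations(nodes, 2)
                 if frozenset((a, b)) not in present]
    for v in nodes:
        clique = [v] + [(v, t) for t in range(l)]  # l extra nodes per original node
        new_nodes += clique[1:]
        new_edges += list(combinations(clique, 2))
    return new_nodes, new_edges


# Hypothetical MC instance: triangle {0, 1, 2} plus the path 2-3-4, so n = 5, h = 3.
V = [0, 1, 2, 3, 4]
E = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4)]
max_clique = {0, 1, 2}
n, h, l = len(V), len(max_clique), 2

V_red, E_red = build_reduction(V, E, l)
removed = set(V) - max_clique                      # the n - h critical nodes
kept_nodes = [v for v in V_red if v not in removed]
kept_edges = [(a, b) for a, b in E_red if a not in removed and b not in removed]

residual = pairwise_connectivity(kept_nodes, kept_edges)
c = (n - h) * comb(l, 2) + h * comb(l + 1, 2)
print(residual, c)
```

Each removed node leaves behind a clique of size l, and each kept node retains its clique of size l + 1, which is exactly how the two terms of c arise.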

Completeness: We need to show that $OPT(\varphi) = h \Rightarrow OPT(\phi) = c$.

Let $OPT(\varphi) = h$ be the MC on graph $G$; it is easy to see that $OPT(\varphi)$ is the maximum independent set on $\bar{G}$. Since the minimum vertex cover and the maximum independent set are complementary on the same graph, $\bar{G}$ can be completely fragmented by removing the set of nodes $V \setminus OPT(\varphi)$, i.e., only isolated nodes exist in $\bar{G}$. Thus, the pairwise connectivity in $G'$ after removing $n - h$ critical nodes is $(n-h)\binom{l}{2} + h\binom{l+1}{2}$, which is equal to $c$.

Soundness: We need to show that $OPT(\varphi) < \frac{1}{n^{1-\epsilon}}h \Rightarrow OPT(\phi) > \Omega\left(\frac{n-k}{n^{\epsilon}}\right)c$.

First of all, since $OPT(\varphi) < \frac{1}{n^{1-\epsilon}}h$, we cannot remove as many as $n - \frac{1}{n^{1-\epsilon}}h$ critical nodes for any $\epsilon < 1$. Therefore, it is impossible to reduce the pairwise connectivity in $G'$ to $\left(n - \frac{1}{n^{1-\epsilon}}h\right)\binom{l}{2} + \frac{1}{n^{1-\epsilon}}h\binom{l+1}{2}$ by fragmenting $G'$ into $n$ components.

For any $\epsilon < 1 - \log_n 2$, it is easy to verify that $1 - \frac{1}{n^{1-\epsilon}} > \frac{1}{n^{1-\epsilon}}$. In this case, we show that the pairwise connectivity in $G'$ after removing $n - h$ critical nodes is lower bounded by $\Omega\left(\frac{n-k}{n^{\epsilon}}\right)c$, with $k = n - h$. Note that the pairwise connectivity is minimized when the number of connected components is maximized and each connected component has the same size, as shown in Fig. 7. That is, since $1 - \frac{1}{n^{1-\epsilon}} > \frac{1}{n^{1-\epsilon}}$, the lower bound is achieved when the same number of cliques in $C_{G_1}$ connect to each clique in $C_{G_2}$, where $C_{G_1}$ and $C_{G_2}$ denote the group of cliques of size $l$ and the group of cliques of size $l + 1$ respectively, supposing that $n - \frac{1}{n^{1-\epsilon}}h$ nodes are removed. Thus, we have the following lower bound

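$$
\begin{aligned}
\binom{\left\lfloor \frac{1 - \frac{1}{n^{1-\epsilon}}}{\frac{1}{n^{1-\epsilon}}} + 1 \right\rfloor (l+1)}{2} \frac{1}{n^{1-\epsilon}} h
&\ge \binom{(n^{1-\epsilon} - 1)(l+1)}{2} \frac{1}{n^{1-\epsilon}} h \\
&= \frac{\binom{(n^{1-\epsilon} - 1)(l+1)}{2} \frac{1}{n^{1-\epsilon}} h}{(n-h)\binom{l}{2} + h\binom{l+1}{2}}\, c \\
&\ge \frac{\binom{n^{1-\epsilon} l}{2} \frac{1}{n^{1-\epsilon}} h}{n\binom{l}{2} + lh}\, c \\
&> \frac{\binom{n^{1-\epsilon} l}{2} \frac{1}{n^{1-\epsilon}} h}{n\binom{l+1}{2}}\, c
= \Omega\left(\frac{h}{n^{\epsilon}}\right) c
= \Omega\left(\frac{n-k}{n^{\epsilon}}\right) c
\end{aligned}
$$

Step 2 holds since $\lfloor z \rfloor \ge z - 1$ for any real number $z$; Steps 4 and 5 hold since we have $l \le n^{1-\epsilon} - 1$ and $h \le n$.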
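The chain of inequalities can be evaluated exactly for one set of parameters as a plausibility check. The sketch below is not part of the proof: the values n = 64, ε = 1/2 (so that n^{1−ε} = 8 is an integer, written `m`), l = 7 and h = 16 are assumptions chosen here to keep the arithmetic exact, and a single instance says nothing about asymptotics. It also confirms the identity (n − h)·C(l, 2) + h·C(l + 1, 2) = n·C(l, 2) + l·h used between Steps 3 and 4.

```python
from fractions import Fraction
from math import comb, floor

# Assumed parameters: m stands for n^(1 - eps) with n = 64 and eps = 1/2.
n, m, l, h = 64, 8, 7, 16
assert l <= m - 1 and h <= n          # the conditions used in Steps 4 and 5

share = Fraction(h, m)                # (1 / n^(1 - eps)) * h
c = (n - h) * comb(l, 2) + h * comb(l + 1, 2)

z = (1 - Fraction(1, m)) / Fraction(1, m) + 1
step1 = comb(floor(z) * (l + 1), 2) * share
step2 = comb((m - 1) * (l + 1), 2) * share
step3 = step2 / c * c                 # same value, rewritten as a multiple of c
step4 = comb(m * l, 2) * share / (n * comb(l, 2) + l * h) * c
step5 = comb(m * l, 2) * share / (n * comb(l + 1, 2)) * c

for name, value in [("step1", step1), ("step2", step2), ("step3", step3),
                    ("step4", step4), ("step5", step5)]:
    print(name, float(value))
print("step5 / c =", float(step5 / c), " h / n^eps =", float(Fraction(h * m, n)))
```

For these values the final ratio step5 / c is of the same order as h / n^ε, in line with the Ω(h / n^ε) term in the derivation.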

Remarks: Note that when k is large, i.e., larger than the size of a vertex cover in G, there is no finite-ratio approximation algorithm since the pairwise connectivity in the residual graph after removing the optimal critical nodes is 0.

Fig. 7. The lower bound instance of CND on $G'$ provided that $|\phi| = n - h$ and $OPT(\varphi) < \frac{1}{n^{1-\epsilon}}h$.

VI. PROPOSED ALGORITHMS

Beyond the theoretical hardness results above, CLD and CND are usually even harder to approximate in practice: the pairwise connectivity can either remain O(n²) for CLD in dense networks even when k is large, or reach 0 for CND when k is larger than the size of a vertex cover. In this section, we present our solution, a Hybrid Iterative Linear Programming Rounding (HILPR) algorithm, for both the CLD and CND problems. At a high level, HILPR formulates CLD and CND as Integer Linear Programs (ILPs) and then solves them using an iterative rounding technique. In addition, HILPR employs local search and constraint pruning techniques to further improve its solution quality and reduce its running time.

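A. Integer Linear Programming Formulation

1) Critical Link Disruptor: For each pair of nodes $i, j \in V$, we define an indicator variable $u_{ij}$ as:

$$
u_{ij} =
\begin{cases}
1, & \text{if } i \text{ and } j \text{ are connected} \\
0, & \text{otherwise}
\end{cases}
$$

Then we have the following ILP:

$$
\begin{aligned}
\min\ & \sum_{i,j \in V} u_{ij} \\
\text{s.t.}\ & u_{ij} + u_{jh} - u_{hi} \le 1 \quad \forall i, j, h \in V \\
& \sum_{(i,j) \in E} w_{ij}(1 - u_{ij}) \le k \\
& u_{ij} \in \{0, 1\}
\end{aligned}
\tag{1}
$$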

where the objective is to minimize the total pairwise connectivity. The first constraint imposes triangular connectivity: if nodes i and j are connected and nodes j and h are connected, then nodes i and h have to be connected. The second constraint means that the total weight of all deleted links has to be at most k. We note that for an edge (i, j) ∈ E, if uij = 0 in the ILP solution, then the link (i, j) is a critical link.
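To make the roles of the constraints in (1) concrete, the sketch below derives the indicator values u_ij from a given set of deleted links and then checks the triangle constraints, the budget constraint, and the objective. It is an illustration under assumptions: the graph (two triangles joined by a bridge), the unit weights, and all function names are chosen here, and no ILP solver is involved.

```python
from itertools import combinations, permutations


def connectivity_indicator(nodes, edges, deleted):
    """u[{i, j}] = 1 if i and j are connected once the `deleted` links are removed."""
    kept = [e for e in edges if frozenset(e) not in deleted]
    label = {v: v for v in nodes}
    changed = True
    while changed:                      # propagate the smallest label per component
        changed = False
        for a, b in kept:
            low = min(label[a], label[b])
            if label[a] != low or label[b] != low:
                label[a] = label[b] = low
                changed = True
    return {frozenset(p): int(label[p[0]] == label[p[1]])
            for p in combinations(nodes, 2)}


def triangle_constraints_hold(nodes, u):
    """Check u_ij + u_jh - u_hi <= 1 for every ordered triple of distinct nodes."""
    return all(
        u[frozenset((i, j))] + u[frozenset((j, h))] - u[frozenset((h, i))] <= 1
        for i, j, h in permutations(nodes, 3)
    )


# Hypothetical instance: triangles {0, 1, 2} and {3, 4, 5} joined by the link (2, 3).
V = list(range(6))
E = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
k = 1

u = connectivity_indicator(V, E, {frozenset((2, 3))})
objective = sum(u.values())                          # total pairwise connectivity
used_budget = sum(1 - u[frozenset(e)] for e in E)    # unit weights w_ij = 1
feasible = triangle_constraints_hold(V, u) and used_budget <= k
print(objective, used_budget, feasible)
```

Deleting the bridge uses the whole budget k = 1 and leaves two triangles, so the objective counts three connected pairs in each.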

2) Critical Node Disruptor: For CND, we simply extend the above ILP formulation for CLD in (1) as

$$
\begin{aligned}
\min\ & \sum_{i,j \in V} u_{ij} \\
\text{s.t.}\ & v_i + v_j + u_{ij} \ge 1 \quad \forall (i, j) \in E \\
& u_{ij} + u_{jh} - u_{hi} \le 1 \quad \forall i, j, h \in V \\
& \sum_{i \in V} w_i v_i \le k \\
& v_i \in \{0, 1\},\ u_{ij} \in \{0, 1\}
\end{aligned}
\tag{2}
$$

where $v_i$ is further defined as

$$
v_i =
\begin{cases}
1, & \text{if node } i \text{ is deleted (i.e., a critical node)} \\
0, & \text{otherwise}
\end{cases}
$$

The first additional constraint guarantees that at least one endpoint of a link has to be deleted if its two endpoints are disconnected in the optimal solution. The other constraints are the same as in CLD. We further constrain k in CND to satisfy Lemma 5 for unweighted graphs in order to avoid zero pairwise connectivity, that is, the case in which all nodes in the network are independent.

Lemma 5: For an unweighted graph $G$, the optimal pairwise connectivity of CND is larger than 0 if $k < |E|/\Delta$, where $\Delta$ denotes the maximum degree in $G$.

Proof: We prove this by contradiction. Assume the optimal pairwise connectivity is 0, that is, $\sum_{i,j \in V} u_{ij} = 0$; then $u_{ij} = 0$ for every link $(i, j)$. Therefore, $v_i + v_j \ge 1$ according to the first constraint in LP (2). Hence we have $\sum_{i \in V} d_i v_i \ge |E|$ by adding this up for all links, where $d_i$ is the degree of node $i$. Note that we assumed $k < |E|/\Delta$, so $\sum_{i \in V} d_i v_i \le \Delta \sum_{i \in V} v_i \le k\Delta < |E|$, which is a contradiction.
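Lemma 5 can be observed on a tiny unweighted graph by exhaustive search. The sketch below is only an illustration: the 6-cycle and the function names are assumptions made here, and the brute-force enumeration stands in for solving ILP (2), which is practical only for very small graphs. For the 6-cycle, |E|/Δ = 6/2 = 3, so the lemma guarantees a positive optimum for k = 2 and makes no claim for k = 3.

```python
from itertools import combinations


def pairwise_connectivity(nodes, edges):
    """Count unordered node pairs joined by a path, using union-find."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())


def optimal_cnd(nodes, edges, k):
    """Exhaustively find k nodes whose removal minimizes pairwise connectivity."""
    best_value, best_set = None, None
    for removed in combinations(nodes, k):
        gone = set(removed)
        kept_nodes = [v for v in nodes if v not in gone]
        kept_edges = [(a, b) for a, b in edges if a not in gone and b not in gone]
        value = pairwise_connectivity(kept_nodes, kept_edges)
        if best_value is None or value < best_value:
            best_value, best_set = value, gone
    return best_value, best_set


# Hypothetical instance: a cycle on 6 nodes, so |E| = 6 and the maximum degree is 2.
V = list(range(6))
E = [(i, (i + 1) % 6) for i in range(6)]
bound = len(E) / 2                      # |E| / Delta = 3

opt_k2, set_k2 = optimal_cnd(V, E, 2)   # k = 2 < 3: the optimum must be positive
opt_k3, set_k3 = optimal_cnd(V, E, 3)   # k = 3: the lemma no longer applies
print(opt_k2, set_k2, opt_k3, set_k3)
```

Removing every other node of the cycle with k = 3 isolates all remaining nodes, which shows that the threshold |E|/Δ cannot be raised for this graph.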

B. Hybrid Iterative LP Rounding Algorithm

Our HILPR algorithm consists of three main steps: (1) relaxing the integrality constraints of the above ILP to obtain the corresponding LP; (2) iteratively solving the LP with k replaced by an experimental parameter γ < k, and rounding to integers the corresponding fractional variables whose total weight is at most γ; and (3) performing a local search to further optimize the solution. The detailed description is shown in Algorithm 1.

Specifically, in each iteration, we solve the LP after setting $k = \gamma$. Let $u^*_{ij}$ and $v^*_i$ be the optimal (fractional) solutions after solving the LP for the CLD and CND problems respectively. In the CLD problem, we round the set $X$ of smallest variables $u^*_{ij}$ to 0 such that $\sum_{u_{ij} \in X} w_{ij} \le \gamma$. Likewise, we round the set $Y$ of largest variables $v^*_i$ to 1 for the CND problem such that $\sum_{v_i \in Y} w_i \le \gamma$. In the next iteration, the graph is first updated according to the previous rounding results, i.e., the identified critical links or nodes are removed from the graph. Then the LP is reformulated according to the new residual graph. The algorithm terminates when the total weight of all identified critical links (or nodes) reaches k.
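The rounding step in one iteration can be sketched as follows. This is a minimal illustration, not the authors' implementation: the fractional values are hypothetical stand-ins for an LP solution (no LP is solved here), the function names are invented, and the rule of stopping at the first element that would exceed γ is an assumption, since the text only requires the selected weight to be at most γ.

```python
def round_links(u_star, weight, gamma):
    """CLD: pick links in ascending order of u* while total weight stays <= gamma."""
    chosen, used = [], 0
    for link in sorted(u_star, key=lambda e: u_star[e]):
        if used + weight[link] > gamma:
            break                       # assumed stopping rule
        chosen.append(link)
        used += weight[link]
    return chosen


def round_nodes(v_star, weight, gamma):
    """CND: pick nodes in descending order of v* while total weight stays <= gamma."""
    chosen, used = [], 0
    for node in sorted(v_star, key=lambda i: v_star[i], reverse=True):
        if used + weight[node] > gamma:
            break                       # assumed stopping rule
        chosen.append(node)
        used += weight[node]
    return chosen


# Hypothetical fractional LP values, for illustration only.
u_star = {("a", "b"): 0.9, ("b", "c"): 0.1, ("c", "d"): 0.4, ("a", "d"): 0.2}
link_weight = {link: 1 for link in u_star}
v_star = {1: 0.2, 2: 0.8, 3: 0.5}
node_weight = {node: 1 for node in v_star}

critical_links = round_links(u_star, link_weight, gamma=2)
critical_nodes = round_nodes(v_star, node_weight, gamma=1)
print(critical_links, critical_nodes)
```

In the full algorithm these selected elements would be removed from the graph before the LP is reformulated for the next iteration.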

Finally, the HILPR algorithm deploys a meta-heuristic approach [17] to enhance the solution S obtained by the iterative rounding step described above. The details of this local search are shown in the last part of Algorithm 1. Take CLD as an example. For each link e in the solution S, we perform a local swap between e and each e′ ∈ N(e) to obtain the new solution S′ = (S ∪ {e′}) \ {e}. Let f(G, S) be the pairwise connectivity of G[E \ S]. If f(G, S′) < f(G, S), we replace e by e′ in the solution, set S to be S′, and recursively perform the local search on e′. The recursive procedure stops when no more improvement can be achieved by the local search, i.e., f(G, S′) ≥ f(G, S). The whole algorithm stops when all links in S have been visited. Similarly, the local search for CND can be carried out by recursively checking the neighbor nodes.

Algorithm 1: HILPR for a given γ

Input: Graph G = (V, E), an integer k, γ
Output: The set of critical links/nodes S

    S ← ∅
    // Iterative LP rounding
    while k > 0 do
        if k < γ then γ ← k
        Use constraint pruning to solve the LP formulation with γ
        S′ ← X links with smallest u*_ij such that Σ_{u_ij ∈ X} w_ij ≤ γ
             (in case of CND, S′ ← Y nodes with largest v*_i such that Σ_{v_i ∈ Y} w_i ≤ γ)
        S ← S ∪ S′
        G ← G[E \ S′]   (in case of CND, G ← G[V \ S′])
        k ← k − γ
    end
    // Local search
    S* ← S
    foreach element e ∈ S do
        Swapping(e) (Algorithm 2)
    end
    S ← S*
    return S

Algorithm 2: Swapping(e)

    S* ← S* \ {e}
    if ∃ e′ ∈ N(e) such that f(G, S′) < f(G, S), where S′ ← (S \ {e}) ∪ {e′}, then
        S* ← S′
        Swapping(e′)
    end
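The swap-based local search for CLD can be sketched in a few lines. The version below is an iterative variant written for illustration, not a transcription of Algorithm 2: it assumes that N(e) is the set of links outside S that share an endpoint with e (the neighborhood is not spelled out in this section), it accepts the first improving swap, and the example graph and names are hypothetical.

```python
def pairwise_connectivity(nodes, edges):
    """Count unordered node pairs joined by a path, using union-find."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)
    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())


def f(nodes, edges, S):
    """Pairwise connectivity of G[E \\ S]."""
    return pairwise_connectivity(nodes, [e for e in edges if e not in S])


def neighbors(e, edges, S):
    """Assumed N(e): links outside S that share an endpoint with e."""
    return [x for x in edges if x not in S and set(x) & set(e)]


def local_search(nodes, edges, S):
    """Repeat improving swaps S' = (S \\ {e}) | {e'} until none exists."""
    S = set(S)
    improved = True
    while improved:
        improved = False
        for e in list(S):
            for e2 in neighbors(e, edges, S):
                candidate = (S - {e}) | {e2}
                if f(nodes, edges, candidate) < f(nodes, edges, S):
                    S, improved = candidate, True
                    break
            if improved:
                break
    return S


# Hypothetical instance: two triangles joined by the bridge (2, 3).
V = list(range(6))
E = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
start = {(1, 2)}                       # a poor initial solution for k = 1
result = local_search(V, E, start)
print(f(V, E, start), result, f(V, E, result))
```

The loop terminates because every accepted swap strictly decreases a nonnegative integer objective. On this instance the search moves from a triangle link to the bridge.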

We note that the LP formulations of CLD and CND each have O(n³) constraints owing to the triangle inequality constraints. To improve the running time of solving the LP, we further propose a constraint pruning technique that eliminates inactive constraints according to the following observation. As a result, the number of active triangle inequality constraints in (1) and (2) can be reduced to O(n²).

Fig. 8. Triangle inequality constraints (on the four nodes Vi, Vj, Vk, Vh).

Consider the four triangle inequality tuples $(i, j, k)$, $(i, j, h)$, $(i, k, h)$ and $(j, k, h)$ on four nodes, and suppose all constraints are satisfied at the beginning. That is, as shown in Fig. 8, $u_{ij} + u_{jh} - u_{hi} \le 1$ and $u_{jk} + u_{hk} - u_{jh} \le 1$ for the tuples $(i, j, h)$ and $(j, k, h)$ respectively. In the case that the triangle inequality of the tuple $(i, j, k)$ is tight, shown shaded in Fig. 8, that is, $u_{ij} + u_{jk} - u_{ik} = 1$, we have

$$u_{hi} \ge u_{ij} + u_{jh} - 1 \ge u_{ij} + u_{jk} + u_{kh} - 2 = u_{ik} + u_{kh} - 1$$

Thus, the triangle inequality of the tuple $(i, k, h)$ is satisfied, shown in bold in Fig. 8. Once the triangle inequality of the tuple $(i, j, k)$ is tight, the triangle inequality of the tuple $(i, k, h)$ will be satisfied for all nodes $h$. Since the number of triangle inequality constraints is $3\binom{n}{3} = O(n^3)$, the number of active constraints will be $O(n^3)/n = O(n^2)$ after the pruning process.
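The implication behind constraint pruning can be verified mechanically on a grid of fractional values. The sketch below is an illustration with assumed names: it enumerates all assignments of the six variables on four nodes over the grid {0, 1/4, 1/2, 3/4, 1}, keeps those satisfying the two premises plus tightness of (i, j, k), and counts any violation of the (i, k, h) constraint. A grid check is not a proof, but it matches the derivation above. The helper also evaluates the constraint count 3·C(n, 3).

```python
from fractions import Fraction
from itertools import product
from math import comb


def check_pruning_implication(steps=4):
    """Return (#assignments satisfying the premises, #violations of the conclusion)."""
    grid = [Fraction(t, steps) for t in range(steps + 1)]
    satisfied, violations = 0, 0
    for u_ij, u_jk, u_ik, u_jh, u_kh, u_hi in product(grid, repeat=6):
        premises = (
            u_ij + u_jh - u_hi <= 1         # constraint of tuple (i, j, h)
            and u_jk + u_kh - u_jh <= 1     # constraint of tuple (j, k, h)
            and u_ij + u_jk - u_ik == 1     # tuple (i, j, k) is tight
        )
        if premises:
            satisfied += 1
            if u_ik + u_kh - u_hi > 1:      # conclusion for tuple (i, k, h)
                violations += 1
    return satisfied, violations


def triangle_constraint_count(n):
    """Number of triangle inequality constraints, 3 * C(n, 3)."""
    return 3 * comb(n, 3)


satisfied, violations = check_pruning_implication()
print(satisfied, violations)
print(triangle_constraint_count(70), triangle_constraint_count(70) // 70)
```

For a 70-node instance the unpruned count is 164220 constraints, and dividing by n gives 2346, the order of magnitude that the O(n²) estimate refers to.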

VII. PERFORMANCE EVALUATION

In this section, we evaluate the performance of our HILPR algorithm on different types of synthetic and real-world networks. The simulation is implemented using the CPLEX optimization suite from ILOG, which includes the simplex method [19], the branch & bound algorithm, and advanced cutting-plane techniques [32].

A. Performance of the HILPR Algorithm

The three networks we use to evaluate the performance of our proposed HILPR algorithm are described as follows:

1) The real terrorist network compiled by Krebs [23] with 62 nodes and 153 links, which reflects the relationships between the terrorists involved in the attacks of Sep. 11, 2001. This experiment evaluates the performance of HILPR on a real-world social network. In order to break down the terrorist network, we can capture the individuals corresponding to the critical nodes identified by HILPR.

2) Waxman network topology, a widely accepted Internet AS topological model, generated by the well-known BRITE generator [26].

3) Power-law network topology, generated by the Barabási graph generator [2]. The power-law property has been discovered as one of the most remarkable properties of many large-scale networks such as the Internet and social networks.

To keep a density similar to that of the real terrorist network and also allow comparison with optimal solutions, we use instances with 70 nodes and 140 links. We generate 100 instances for both the Waxman and power-law models and show the average results.

In order to show the effectiveness of our proposed HILPR algorithm, we compare it with the optimal solution obtained by solving the ILP directly. We also compare HILPR with two centrality approaches: degree centrality (DC) and betweenness centrality (BC), which are often used in network analysis [10]. In DC, the k links and nodes of largest degree are selected as critical links and nodes, where the degree of a link (u, v) is defined as d(u) + d(v). Similarly, in BC, the k links and nodes with largest betweenness are selected as critical links and nodes, where the betweenness of a link or a node is defined as the number of shortest paths among all pairs of nodes that pass through it. For CND, we further compare HILPR with the CNLS approach proposed by Arulselvan et al. [9], which also aims to minimize the pairwise connectivity.
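The degree centrality baseline described above is simple enough to state in code. The following is a minimal sketch with assumed function names and a hypothetical example graph; ties are broken by input order, which is a choice made here since the text does not specify a tie-breaking rule.

```python
def node_degrees(edges):
    """Degree d(v) of every node that appears in the edge list."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return degree


def dc_critical_nodes(edges, k):
    """DC baseline for CND: the k nodes of largest degree."""
    degree = node_degrees(edges)
    return sorted(degree, key=lambda v: degree[v], reverse=True)[:k]


def dc_critical_links(edges, k):
    """DC baseline for CLD: the k links (u, v) of largest d(u) + d(v)."""
    degree = node_degrees(edges)
    return sorted(edges, key=lambda e: degree[e[0]] + degree[e[1]], reverse=True)[:k]


# Hypothetical graph: a star centered at node 0 with one extra link (3, 4).
E = [(0, 1), (0, 2), (0, 3), (3, 4)]
top_nodes = dc_critical_nodes(E, 2)
top_links = dc_critical_links(E, 1)
print(top_nodes, top_links)
```

Because the ranking looks only at degrees, it selects the same elements whether or not their removal actually separates the graph, which is the limitation the later comparison with HILPR discusses.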

Since γ is the only free parameter in our algorithm, we first compare the impact of different γ values in our experiments, so that solution quality and running time can be balanced by selecting γ carefully. As illustrated in Fig. 9, the solutions returned by our algorithm under different γ values are very close to one another. Thus, we use γ = 1 for CND due to its slightly better performance, and γ = 5 for CLD to reduce the running time since the number of critical links is usually larger than the number of critical nodes. Next, we show that our HILPR approach returns a near-optimal solution and outperforms the other approaches.

Fig. 9. The performance of HILPR using different γ in terrorist network. Panels: (a) Critical Links, with curves for γ = 1, 4, 7, 10; (b) Critical Nodes, with curves for γ = 1, 2, 3, 4. Horizontal axis: the value k; vertical axis: pairwise connectivity.

Fig. 10 and Fig. 11 report the comparison of the HILPR algorithm with the centrality algorithms for CLD and CND on the three networks described above. In these figures, we notice that the solution of the HILPR algorithm closely approaches the optimal solution for both CLD and CND on all three networks. (Note that a portion of the optimal solutions of CND in Waxman networks is missing due to the extremely high computational cost, which arises because the network is neither almost intact nor almost fragmented.) The pairwise connectivity obtained by the degree centrality algorithm is much worse than that of the HILPR algorithm because links or nodes of higher degree may already be connected to other critical links or nodes and therefore no longer need to be counted as critical. For instance, the hub nodes (nodes of high degree) in power-law networks are not necessarily connected with each other, so the removal of two hub nodes could be less effective in reducing the pairwise connectivity than the removal of two other nodes that disconnect the network. Betweenness centrality performs worst because it considers only shortest paths rather than all paths. That is, a pair of nodes can still be connected even when the shortest path between them is destroyed. Our HILPR algorithm outperforms CNLS mainly because of the different strategy for choosing the initial critical nodes before the local search. The CNLS method is based only on the maximum degree from a maximal independent set, hoping that the removal of these nodes can greatly fragment the network. However, this is not always true since many nodes in the maximal independent set are usually of low degree, and consequently do not play an important role in destroying the network. In our HILPR approach, by solving the LP and rounding the top elements iteratively, we take into account all possible paths and connections between different nodes so that the critical elements can be accurately identified.

Fig. 12. Overlapping critical nodes between optimal solution and HILPR in terrorist network. Horizontal axis: the value k; vertical axis: overlapping critical nodes (%).


Fig. 10. The performance evaluation of HILPR against the degree and betweenness centrality algorithms for the CLD problem. Panels: (a) Waxman networks; (b) Power-law networks; (c) Terrorist network. Curves: HILPR, DC, BC, Optimal. Horizontal axis: the value k; vertical axis: pairwise connectivity.

Fig. 11. The performance evaluation of HILPR against the degree and betweenness centrality, and CNLS algorithms for the CND problem. Panels: (a) Waxman networks; (b) Power-law networks; (c) Terrorist network. Curves: HILPR, CNLS, DC, BC, Optimal. Horizontal axis: the value k; vertical axis: pairwise connectivity.

To further show the effectiveness of our metric and algorithm, i.e., that the critical links and nodes in real-world networks can be correctly detected using our algorithm, we examine the real terrorist network, in which the identities of nodes are available. The results returned by our HILPR algorithm show that we can detect the truly important personnel by minimizing the total pairwise connectivity. For instance, the two nodes 37 and 48 in the terrorist network, which have been identified as the leaders in [1], [23], are correctly detected by our HILPR algorithm with k = 2. In contrast, if we use the degree and betweenness centrality methods, only node 37 is detected when k = 2, and node 48 is not detected until k is chosen to be 6 and 5 respectively.

Even though our objective is to minimize the pairwise connectivity, we are still interested in the overlap between the critical nodes returned by our HILPR algorithm and the optimal critical nodes. As reported in Fig. 12, in the real terrorist network, the optimal critical nodes are detected with 100% overlap by our HILPR algorithm in more than 1/3 of the cases for different k values. The average overlapping percentage is around 80%, since some nodes play the same role in network connectivity, so the pairwise connectivity can still be minimized even when our HILPR algorithm identifies critical nodes that differ from the optimal solution.

Moreover, the running time of our HILPR algorithm is less than 5 seconds on all three networks, for detecting either critical links or nodes, which is only slightly worse than the centrality algorithms (1-2 seconds) and the CNLS algorithm (2-3 seconds). Especially when k is small, i.e., only the most critical elements need to be detected, our algorithm finishes in around 3 seconds, which further illustrates the effectiveness of our HILPR algorithm in terms of both solution quality and running time.

B. Metric Evaluation

We evaluate the residual network obtained by the HILPR algorithm under various network vulnerability metrics. As has been shown in Fig. 10 and 11, degree centrality and betweenness centrality cannot accurately reflect the network vulnerability. Therefore, we focus on the following three other metrics: (1) average shortest path length (ASP) between each node pair (the shortest distance is 0 if the pair of nodes is not connected), (2) average available flows (AAF) between each node pair, and (3) global clustering coefficient (GCC), defined as $\frac{\#\text{closed triplets}}{\#\text{connected triples of vertices}}$, in which a closed triplet consists of three nodes that are connected by three undirected ties.
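The GCC ratio can be computed directly from its definition. The sketch below is an illustration under a stated counting convention that is an assumption here: a connected triple is a center node together with an unordered pair of its neighbors, and it is closed when those two neighbors are adjacent (so one triangle contributes three closed triplets). The function name and example graphs are hypothetical. The function returns `None` when there are no connected triples, the situation in which the ratio has a zero denominator.

```python
from itertools import combinations


def global_clustering(nodes, edges):
    """#closed triplets / #connected triples, or None if there are no triples."""
    adjacent = {v: set() for v in nodes}
    for a, b in edges:
        adjacent[a].add(b)
        adjacent[b].add(a)
    triples = closed = 0
    for center in nodes:
        for a, b in combinations(sorted(adjacent[center]), 2):
            triples += 1                # path a - center - b
            if b in adjacent[a]:
                closed += 1             # the path closes into a triangle
    return None if triples == 0 else closed / triples


triangle = global_clustering([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
with_pendant = global_clustering([0, 1, 2, 3], [(0, 1), (1, 2), (0, 2), (2, 3)])
fragmented = global_clustering([0, 1, 2, 3], [(0, 1), (2, 3)])   # only disjoint links
print(triangle, with_pendant, fragmented)
```

A residual graph consisting only of isolated nodes and disjoint links has no connected triples at all, so the metric carries no information once the network is highly fragmented.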

In particular, we are interested in how the values of these metrics change in the residual network after we remove the critical elements that successfully reduce the pairwise connectivity of the network. Since our HILPR algorithm can successfully detect the real critical links and nodes as discussed in the previous subsection, we evaluate the above three metrics on the residual graphs obtained by the HILPR algorithm. Fig. 13 shows the changes in the values of the three metrics after removing different numbers of critical links or nodes. Unfortunately, none of these three metrics in the residual networks clearly reflects the network vulnerability. As for the ASP, we can only consider the ASP within each connected component after the network is disconnected; otherwise the ASP becomes infinite and therefore fails to measure the network vulnerability. However, the value of the ASP within connected components is either irregular (Fig. 13(a)) or contrary to intuition, i.e., the ASP usually increases after removing critical elements (Fig. 13(b)). Similarly, the AAF fails to assess the network vulnerability due to its irregularity for critical links. The monotonic decrease of the AAF in the residual networks after removing critical nodes is largely due to the disconnection of the network, which reduces the flow between two nodes in different connected components to 0. However, the nodes disconnecting the network are not necessarily critical nodes. Finally, the variation of the GCC values is irregular for both critical links and nodes due to the simultaneous decrease of the number of connected triples of vertices. In particular, when the network is highly fragmented, this metric can easily become infinite and meaningless (in the residual network of k ≥ 22 as shown in Fig. 13(b)) since the number of connected triples of vertices becomes 0.

TABLE I
PAIRWISE CONNECTIVITY OF VARIOUS NETWORKS (%)

                                        Critical Links (%)          Critical Nodes (%)
Networks                                5     10    15    20        3     6     9
Real-world Networks
  Greg's Cable Network                  9.7   4.8   2.3   1.5       7.7   2.8   1.1
  US Network Assets                     44.8  32.5  22.4  18.5      49.6  17.7  9.6
  Terrorist Network                     52.5  40.3  33.9  23.3      44.8  31.1  21.0
  US Airline 1997                       46.2  30.1  17.9  13.5      47.5  21.4  10.5
Power-Law Networks (150 nodes)
  Avg. Degree 1                         9.8   6.5   4.5   3.4       5.1   1.9   1.1
  Avg. Degree 2                         90.6  71.8  59.1  44.2      87.1  70.5  55.7
  Avg. Degree 3                         94.7  76.6  65.8  57.5      92.1  82.1  70.5
  Avg. Degree 4                         98.1  81.4  70.9  60.6      94.7  88.3  82.1
Waxman Networks (150 nodes)
  Avg. Degree 1                         4.8   1.9   1.0   0.6       8.0   2.8   1.7
  Avg. Degree 2                         71.1  53.2  29.5  17.3      90.8  75.3  62.1
  Avg. Degree 3                         85.3  77.6  62.3  53.0      93.4  80.9  73.9
  Avg. Degree 4                         89.6  81.2  67.8  63.5      94.7  87.1  82.2

Fig. 13. The comparison of different metrics on terrorist network. Panels: (a) Critical Links; (b) Critical Nodes. Curves: Average Shortest Path (ASP) (%), Average Available Flow (AAF) (%), Global Clustering Coefficients (GCC). Horizontal axis: the value k; vertical axis: the changes in values.

C. Case Studies

Having observed the effectiveness of HILPR in the above experiments, we use it to analyze more real-world traces, listed in Table II, and to gain insight into their vulnerabilities. Apart from the terrorist network, we obtain the US Airline 1997 network from Newman's dataset [3], and compile the US Network Assets and Greg's Cable Network from the data available online [5], [4] respectively. Specifically, the US Airline 1997 network describes the airline connections in the USA in 1997. The US Network Assets network comprises the network assets that serve current customer needs in the XO Communications service, and Greg's Cable Network is a global cable network with the landings as its nodes.

TABLE II
REAL-WORLD NETWORKS

Networks                  Nodes   Links   Avg. Degree
Greg's Cable Network      570     731     1.28
US Network Assets         71      98      1.38
Terrorist Network         62      153     2.46
US Airline 1997           332     2126    6.40

As reported in Table I, Greg's Cable Network is the most vulnerable as it is the sparsest, i.e., it has the smallest average degree, and the Terrorist Network is more robust than US Network Assets due to its higher density. Surprisingly, although both serve as cable networks, Greg's Cable Network is much more vulnerable than US Network Assets even though its average degree is only slightly lower. The underlying reason is that most of the cables connect to very few hub landings, which are easily attacked so as to fragment the whole network.

Table I further reports the vulnerabilities of Waxman networks and power-law networks with different average degrees, which helps us to design more robust complex networks with respect to the estimated disruptions. Based on the experimental results, we can draw some significant and interesting observations on the vulnerability of various network topologies. The relative vulnerability of Waxman networks and power-law networks depends on whether critical links or critical nodes are removed. On one hand, compared with power-law networks, Waxman networks are more robust to node removal, as the pairwise connectivity of Waxman networks is much higher than that of power-law networks after removing the same number of critical nodes. The robustness of the Waxman model derives from the similar density of its different components, so that there are no significantly important nodes. On the other hand, Waxman networks are more vulnerable than power-law networks after removing the same number of critical links. This is due to the lack of large dense components. That is, the removal of critical links in Waxman networks can decompose the original components more easily than in power-law networks. These observations on the relation between the average degree of a network and its vulnerability help us to design a network that not only reduces the cost but also tolerates threats.

VIII. CONCLUSION

In this paper, we study the optimization problems of detecting critical links and nodes to assess network vulnerability based on the recently proposed metric, the total pairwise connectivity. We have performed a comprehensive analysis of the computational complexity of these two optimization problems in various types of networks, ranging from unit disk graphs to power-law graphs, which correspond to wireless sensor networks and real large-scale networks. Because of the NP-completeness, we further analyzed the inapproximability of CND and provided the HILPR algorithm, which is based on a novel iterative LP rounding method, along with local search and constraint pruning techniques. The experiments not only show the effectiveness of our algorithm but also reveal the degree of vulnerability of different real-world and synthetic networks.


ACKNOWLEDGEMENT

This work is partially supported by NSF Career Award 0953284, DTRA, Young Investigator Award, Basic Research Program HDTRA1-09-1-0061.

REFERENCES

[1] http://en.wikipedia.org/wiki/september 11 attacks.[2] http://networkx.lanl.gov/reference/generated/networkx.generators.

random graphs.barabasi albert graph.html.[3] http://vlado.fmf.uni-lj.si/pub/networks/pajek/data/gphs.htm.[4] http://www.cablemap.info/.[5] http://www.xo.com/about/network/pages/maps.aspx.[6] R. Albert, I. Albert, and G. L. Nakarado. Structural vulnerability of the

north american power grid. Phys. Rev. E, 69(2), Feb 2004.[7] R. Albert, H. Jeong, and A. Barabasi. Error and attack tolerance of

complex networks. Nature, 406(6794):378–382, 2000.[8] R. Albert, H. Jeong, and A. Barabasi. Error and attack tolerance of

complex networks. Nature, 406(6794):378–382, July 2000.[9] A. Arulselvan, C. W. Commander, L. Elefteriadou, and P. M. Pardalos.

Detecting critical nodes in sparse graphs. Comput. Oper. Res., 36:2193–2200, July 2009.

[10] S. P. Borgatti and M. G. Everett. A graph-theoretic perspective on centrality. Social Networks, 28(4):466–484, 2006.

[11] B. N. Clark, C. J. Colbourn, and D. S. Johnson. Unit disk graphs. Discrete Mathematics, 86(1-3):165–177, December 1990.

[12] E. Dahlhaus, D. S. Johnson, C. H. Papadimitriou, P. D. Seymour, and M. Yannakakis. The complexity of multiterminal cuts. SIAM Journal on Computing, 23:864–894, 1994.

[13] T. Dinh, Y. Xuan, M. Thai, P. Pardalos, and T. Znati. On new approaches of assessing network vulnerability: Hardness and approximation. Networking, IEEE/ACM Transactions on, 20(2):609–619, April 2012.

[14] S. Eubank, V. S. A. Kumar, M. V. Marathe, A. Srinivasan, and N. Wang. Structural and algorithmic aspects of massive social networks. In Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms, SODA ’04, pages 718–727, Philadelphia, PA, USA, 2004. Society for Industrial and Applied Mathematics.

[15] A. Ferrante, G. Pandurangan, and K. Park. On the hardness of optimization in power-law graphs. Theoretical Computer Science, 393(1-3):220–230, 2008.

[16] C. Gkantsidis, M. Mihail, and A. Saberi. Conductance and congestion in power law graphs. In Proceedings of the 2003 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, SIGMETRICS ’03, pages 148–159, New York, USA, 2003. ACM.

[17] F. Glover and M. Laguna. Tabu search, 1997.

[18] J. Hastad. Clique is hard to approximate within n^{1−ε}. Electronic Colloquium on Computational Complexity (ECCC), 4(38), 1997.

[19] F. S. Hillier and G. J. Lieberman. Introduction to operations research, 4th ed. Holden-Day, Inc., San Francisco, CA, USA, 1986.

[20] J. E. Hopcroft and R. M. Karp. An n^{5/2} algorithm for maximum matchings in bipartite graphs. SIAM J. Comput., 2:225–231, 1973.

[21] D. Konig. Grafok es matrixok. Matematikai es Fizikai Lapok, 38:116–119, 1931.

[22] J. Kleinberg. The small-world phenomenon: An algorithmic perspective. In Proceedings of the 32nd ACM Symposium on Theory of Computing, pages 163–170, 2000.

[23] V. E. Krebs. Uncloaking terrorist networks. First Monday, 7(4), 2002.

[24] Luciano, F. Rodrigues, G. Travieso, and V. P. R. Boas. Characterization of complex networks: A survey of measurements. Advances in Physics, 56(1):167–242, Aug. 2007.

[25] T. C. Matisziw and A. T. Murray. Modeling s-t path availability to support disaster vulnerability assessment of network infrastructure. Computers & Operations Research, 36(1):16–26, 2009. Part Special Issue: Operations Research Approaches for Disaster Recovery Planning.

[26] A. Medina, A. Lakhina, I. Matta, and J. W. Byers. BRITE: An approach to universal topology generation. In MASCOTS, page 346. IEEE Computer Society, 2001.

[27] K. Park and H. Lee. On the effectiveness of route-based packet filtering for distributed DoS attack prevention in power-law internets. In Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, SIGCOMM ’01, pages 15–26, New York, NY, USA, 2001. ACM.

[28] Y. Shen, N. P. Nguyen, and M. T. Thai. Exploiting the robustness on power-law networks. In COCOON, pages 379–390, 2011.

[29] M. D. Summa, A. Grosso, and M. Locatelli. Complexity of the critical node problem over trees. Computers & OR, 38(12):1766–1774, 2011.

[30] F. Sun and M. A. Shayman. On pairwise connectivity of wireless multihop networks. Int. J. Secur. Netw., 2(1/2):37–49, Mar. 2007.

[31] L. G. Valiant. Universality considerations in VLSI circuits. IEEE Trans. Computers, 30(2):135–140, 1981.

[32] L. A. Wolsey. Integer Programming. John Wiley, New York, 1998.

Yilin Shen (S’12) is currently a Ph.D. candidate at the Department of Computer and Information Science and Engineering, University of Florida, under the supervision of Dr. My T. Thai. He received the M.S. degree in Computer Engineering from the University of Florida in 2012. His research interests include the vulnerability assessment and security of complex networks, and the design of approximation algorithms for network optimization problems. He is a student member of the IEEE.

Nam P. Nguyen received the B.S. degree from Vietnam National University in 2007 and the M.S. degree from Ohio University in 2009, both in Mathematics. He is currently working toward the Ph.D. degree in Computer Information Science and Engineering, University of Florida, under the supervision of Dr. My T. Thai. His current research interests include community detection methods for both static and dynamic networks, and effective approximation algorithms for networking problems.

Ying Xuan received the Ph.D. degree in Computer Information Science and Engineering from the University of Florida in 2011. His Ph.D. research focused on Pooling Theory for Network Security and Reliability. He is currently a Machine Learning Researcher and Data Scientist at eBay Inc. His current research interests include natural language processing, machine learning, and data mining.

My T. Thai (M’06) is an Associate Professor at the Computer and Information Science and Engineering Department, University of Florida. She received her Ph.D. degree in Computer Science from the University of Minnesota in 2005. Her current research interests include algorithms and optimization on network science and engineering. Dr. Thai has engaged in many professional activities, including serving as the PC chair of IEEE IWCMC 12, IEEE ISSPIT 12, and COCOON 2010. She is an associate editor of JOCO and IEEE TPDS, and a series editor of Springer Briefs in Optimization. She has received many research awards, including a Provost’s Excellence Award for Assistant Professors at the University of Florida, a DoD YIP, and an NSF CAREER Award.