Collaborative IDS Conﬁguration: A Two-layer Game-Theoretical Approachhdai/GC16-RJ.pdf · 2016-09-01 · Collaborative IDS Conﬁguration: A Two-layer ... select its detection libraries

Collaborative IDS Configuration: A Two-layerGame-Theoretical Approach

Richeng Jin, Xiaofan He, and Huaiyu Dai, Senior Member, IEEEDepartment of ECE

North Carolina State UniversityEmail: {rjin2,xhe6,hdai}@ncsu.edu

Abstract—As information systems become ubiquitous, Intru-sion Detection Systems (IDSs) have assumed increasing im-portance. As a result, substantial amount of research effortshave been devoted to developing various intrusion detectionalgorithms. However, there is still no single detection algorithmthat can catch all possible attacks. On the other hand, it isinfeasible for practical IDSs to run all the detection algorithmssimultaneously due to resource limitation, leaving potential op-portunities for the adversaries to explore. This resource scarcityproblem becomes more severe when the system is in an illstate (e.g., partially compromised). Enabling collaboration amongmultiple IDSs may be a viable way to mitigate this problem.Particularly, IDSs in the healthy state can share some of theiridle computational resources to those in ill states, so as to improvethe overall intrusion detection performance. Considering this, thecollaborative IDS configuration problem is formulated as a two-layer stochastic game (SG) in this work and a new algorithm isproposed to solve this two-layer SG. Simulation results show thatthe proposed algorithm can provide an effective collaborativeconfiguration scheme, leading to significant detection perfor-mance gain. Some performance analysis has also been given, andthe conditions under which there is a guaranteed improvementin expected system performance have been derived.

I. INTRODUCTION

When adversaries invade an information system, IntrusionDetection Systems (IDSs) that can detect intrusions throughmonitoring the system/network activities or policy violationswill serve as a vital line of defense to protect the importantinformation asset. Despite that many intrusion detection algo-rithms have been developed in literature, no single algorithmcan catch all the intrusions. In the meantime, it is impossiblefor a real-world IDS to run all the existing algorithms simul-taneously due to resource limitation [1]. Because of these,substantial research efforts have been devoted to the IDSconfiguration problem, i.e., how to configure the IDS witha proper library of intrusion detection algorithms for effectiveintrusion detection. Among the others, game theory has beenemployed as an effective tool to analyze the strategic interac-tion between IDSs and attackers, providing useful guidance forIDS configuration(see, e.g., [2–5] and the references therein).Nonetheless, few of them consider collaboration among IDSs.

The performance of traditional IDSs that work indepen-dently may not be satisfactory due to limited information and

This work was supported in part by the NCSU Science of Security Lablet,and in part by National Science Foundation under Grants ECCS-1307949 andEARS-1444009.

resource. To further enhance intrusion detection performance,intrusion detection networks (IDNs) which consist of multiplecollaborative IDSs have been developed in literature [6]. Thecollaboration in existing IDNs is mainly through informationsharing. Particularly, two forms of information sharing areconsidered. In information based IDNs, IDSs share their ob-servations and detection knowledge (e.g., a new type of attack)through message exchange. While in consultation based IDNs,information is shared indirectly. Specifically, when an IDSdoes not have sufficient confidence to make a decision, it maysend consultation requests to other IDSs with more completeinformation and ask them to help detect intrusions [7, 8].

Nonetheless, in many practical situations, the performanceof an IDS is limited solely because it does not have enoughcomputational resource (e.g., CPU time and memory) to run allthe required detection algorithms. For example, when a systemis already in an ill state due to some previous undetected in-trusions, it may not have sufficient resource to support the IDSto run all the detection algorithms suggested by the optimalconfiguration. In this case, the information based IDNs will nothelp much, since the performance of IDS is now limited bycomputational resource rather than detection knowledge. Thusin this work, we will focus on the consultation based IDNs, andparticularly, consider computational resource sharing in IDN.The basic idea is that an ill IDS can send consultation requeststo other IDSs and ask them to use their local computationalresource to run the corresponding detection algorithms. Whena healthy IDS receives the request, it can allocate its redundantresource to help the ill one. Likewise, when the ill IDSrecovers, it can also offer its resource to help other IDSswhen needed. This leads to a collaborative dynamic IDSconfiguration problem studied in this paper.

In this work, a two-layer stochastic game (SG) is pro-posed to model this collaborative dynamic IDS configurationproblem and guide the design of the corresponding effectivecollaboration strategy. Specifically, the first layer deals withthe interaction between each IDS and the corresponding at-tacker. In this interaction, both the IDSs and the attackerscan use learning algorithms (e.g., Q-learning [4]) to graduallylearn their own strategies. The second layer deals with thecollaboration among IDSs in the IDN. In practice, the IDSsmay need incentives to promote collaboration. In this work,it is assumed that there is an IDN manager who will helpto determine a resource allocation strategy that can maximize

TABLE IIMPORTANT NOTATIONS

N the set of IDSs.si the state of subsystem i.Ui(s

i) the number of libraries IDSi can load at state si

L the set of detection libraries.A the set of attacks.li the configuration of IDSi.ai the action of attacker γi.

pln,amthe probability of attack am been detected by libraryln.

pst state transition probability.

wIDSi

si,ak

the importance of detecting attack ak for IDSi atstate si.

wγisi,ak

the profit of fulfilling attack ak for attacker γi atstate si.

gij the amount of resource that IDSi shares IDSj .πIDS the strategy of IDS.πγ the strategy of attacker.πγ the attacker’s strategy estimated by IDS.

Fig. 1. Block diagram of an IDN.

the overall detection performance of the IDN. To achieve asocially optimal allocation, a Vickrey-Clarke-Groves (VCG)auction [9] is used in this work.

The remainder of this paper is organized as follows. Sec-tion II formulates the collaborative IDS configuration problem.The proposed two-layer algorithm is presented in Section III.The performance of the proposed method is analyzed in Sec-tion IV and its effectiveness is examined through simulationsin Section V. Related works are discussed in Section VI.Conclusions and future works are presented in Section VII.

II. PROBLEM FORMULATION

In this section, a two-layer SG is formulated for the col-laborative IDS configuration problem. Table I shows theimportant notations used in this paper. An IDN that consistsof N IDSs is considered, denoted by N = {1, 2, ..., N}.Without loss of generality, it is assumed that each IDSfaces one attacker that can launch multiple attacks simul-taneously. Fig. 1 depicts the scenario of three fully con-nected IDSs facing three independent attackers. Let S ={s1, s2, ..., sN , U1(s1), U2(s2), ...UN (sN )} denote the systemstate, in which si represents the state of subsystem i, andUi(s

i) represents the number of libraries IDSi can load whichdepends on the current state of subsystem i. In addition,it is assumed that each subsystem has two possible states

{H, I}, where H stands for the healthy state and I standsfor the ill state. Let L = {l1, l2, ..., lL} denote the set ofdetection libraries that are available to all the IDSs. Denotedby L∗ = σ(L) the power set of L, with cardinality |L∗| = 2L.Each li ∈ L∗ is a possible configuration of IDSi.

On the other hand, it is assumed that each of the at-tackers (denoted by γ) can launch L different attacks A ={a1, a2, ..., aL}, in which attack am will be detected by libraryln with probability pln,am . The detection probability pln,am isassumed high when the library matches the attack (i.e., m = n)and low otherwise. Similarly, each attacker γi can launch a setof attacks ai ∈ A∗, with A∗ = σ(A) denoting the power setof A.

To model the influence of library configuration li andattacks ai on the subsystem state, it is assumed that thesubsystem state will transit from si to si

′, with probability

pst(si′ |si, li,ai). When the subsystem is in state si and IDSi

loads the set of libraries li while the attacker γi takes the setof attacks ai, the reward function of IDSi is modeled as

RIDSi(si, li,ai) =∑ak∈ai

pli,akwIDSi

si,ak, (1)

where wIDSi

si,akrepresents the importance of detecting attack

ak when subsystem i is in state si, and pli,ak refers tothe probability of detecting ak when IDSi loads the set oflibraries li, which is given by

pli,ak = 1−∏ln∈li

(1− pln,ak). (2)

Note that the modeling in (1) conforms to the intuition that theIDS can obtain better reward when it loads the right library.The term wIDSs,ak

is motivated by the fact that the damagecaused by the same attacks at different system states may bedifferent. In this work, we consider homogeneous IDSs andattackers.

The reward function of the attacker γi is modeled as

Rγi(si, li,ai) =∑ak∈ai

(1− pli,ak)wγisi,ak , (3)

where wγisi,ak denotes the profit of fulfilling attack ak forattacker γi when subsystem i is at state si. To maximize itsreward, IDSi has to form a strategy πIDSi(si, li) so as toselect its detection libraries that match the potential attacksfrom the attacker γi.

In the considered collaborative IDS configuration problem,the connected IDSs are allowed to cooperate through resourcesharing. Let gij be the amount of resource that IDSi sharesIDSj , for i, j ∈ N . The resource that can be shared by eachIDS is restricted by its state-dependent capacity, i.e.,∑

j∈Ngij 6 Ui(s

i),∀i ∈ N , (4)

III. THE PROPOSED METHOD

The proposed method for collaborative IDS configuration isillustrated in this section. In each round of the proposed

Fig. 2. Overview of the proposed approach.

scheme, the IDSs will first interact with their correspondingattackers to learn their strategies, and then all the IDSs willreport to the IDN manager the amount of resource they have aswell as their expected reward functions. It is assumed that theIDN manager is secure and able to verify the messages fromIDSs and thus no IDS will send false information. The IDNmanager then determines a resource allocation scheme for abetter overall detection performance. Fig. 2 gives an overviewof the proposed method.

A. First Layer: Stochastic Game

The first layer is concerned with the interaction between eachIDS and its corresponding attacker. The objective of each IDS(attacker) is to maximize its cumulative discounted rewardE{∑∞n=1 β

nRIDSn } (E{∑∞n=1 β

nRγn}) with discounting fac-tor β ∈ [0, 1) representing its long-term performance withdiminishing weighting on the future. The IDS (attacker) needsto learn a strategy πIDS(s, l) (πγ(s,a)) which specifies theprobability of taking action l ∈ L∗ (a ∈ A∗) at a givenstate s. To this end, the interaction can be formulated as aSG as follows: the IDS and the attacker are the two players;the set of possible subsystem states defines the state spaceof the stochastic game; σ(A) and σ(L) are the action spaceof the IDS and the attacker; and the state transition functionpst(s

′|s, l,a) defines the probability of reaching a future stategiven the current state and actions of both the IDS and theattacker. Interested readers may refer to [10] for more detailson SG.

The Nash-Q learning algorithm [10] can be employedto solve the SG described above. At each time slot n,after observing the actions (ln,an), the reward rIDSn =RIDS(sn, ln,an), and the state transition from sn to sn+1,the IDS updates the quality and the value functions QIDS

and V IDS for itself. It will also maintain a pair of virtualquality functions Qγ and V γ , to keep track of the attacker’sbehavior. In particular, these quantities are updated as follows:

QIDSn+1 (s, l,a) =(1− αn)QIDSn (s, l,a) + αn

[rIDSn + β · V IDSn (sn+1)

], for (s, l,a) = (sn, ln,an),

QIDSn (s, l,a), otherwise ,

(5)

Qγn+1(s, l,a) =(1− αn)Qγn(s, l,a) + αn

[Rγ(s, l,a) + β · V γn (sn+1)

], for (s, l,a) = (sn, ln,an),

Qγn(s, l,a), otherwise ,

(6)

V IDSn+1 (s) = NASHIDS(QIDSn+1 (s, ·, ·), Qγn+1(s, ·, ·)), (7)

V γn+1(s) = NASHγ(QIDSn+1 (s, ·, ·), Qγn+1(s, ·, ·)). (8)

The updated strategies of the IDS and attacker are given by

πIDSn+1 (s, ·) = arg NASHIDS(QIDSn+1 (s, ·, ·), Qγn+1(s, ·, ·)), (9)

πγn+1(s, ·) = arg NASHγ(QIDSn+1 (s, ·, ·), Qγn+1(s, ·, ·)). (10)

Note that the optimal quality function QIDS (Qγ)represents the total expected discounted reward of theIDS (attacker) attained by taking action l (a) giventhe state s and attacker’s action a (IDS’s action l).In (7) and (9), NASHIDS(QIDSn+1 (s, ·, ·), Qγn+1(s, ·, ·)) andarg NASHIDS(QIDSn+1 (s, ·, ·), Qγn+1(s, ·, ·)) give the rewardand the corresponding strategy of the IDS at a NE of an equiva-lent single stage game with corresponding reward functions ofthe two players specified by the two matrices QIDSn+1 (s, ·, ·) andQγn+1(s, ·, ·), respectively. The reward and the correspondingstrategy of the attacker are defined similarly.

It has been shown in [10] that with a suitable learningrate αn in (5) and (6), the learned quantities in the Nash-Qalgorithm converge to the corresponding optimal ones undercertain sufficient conditions. The attacker can learn its optimalstrategy through a similar procedure.

B. Second Layer: VCG Auction for Resource Allocation

The second layer is concerned with the resource sharing inthe IDN. In this work, a VCG auction algorithm is employed,which can be described as follows: First, given the learnedstrategies from the first layer, each IDSi reports to the IDNmanager the number of libraries it can load currently (i.e.,Ui(s

i)) and the expected reward function given by

RIDSi(si, li(Ui), πγi) =

∑ak∈A

πγi(si, ak)pli,akwIDSi

si,ak, (11)

in which πγi(si, ak) denotes the estimated probability thatthe attacker γi will launch attack ak when the subsystemi is in the state si. The IDN manager then computes theresource allocation U

′= (U

′

1, ..., U′

N ) that maximizes thetotal expected reward of all the IDSs by solving the followingproblem:

Rmax = maxgij

∑u

RIDSu(su, lu(U′

u), πγu)

s.t.∑j

gij 6 Ui(si)

U′

i = Ui(si) +

∑j

gji −∑j

gij

gij ∈ N, ∀i, j ∈ N .

(12)

Furthermore, the IDN manager computes the maximum totalexpected reward if IDSu is excluded from the auction, i.e.,Rmax/u for each u ∈ N . The manager then charges IDSu bythe amount given by

Rmax/u −∑i 6=u

RIDSi(si, li(U′

i ), πγi). (13)

Note that for the IDSs who share their resource with the otherIDSs, the charges will be negative, which means that the IDNmanager will pay them for their effort. The proposed methodis summarized in Algorithm 1.

Algorithm 1 Collaborative Nash-Q Learning for IDS Config-uration

1. Initialization: {QIDS0 } = 0, {Qγ0} = 0, {V IDS0 } = 0,{V γ0 } = 0 and πIDS0 , πγ0 are uniformly distributed.2. Each IDS takes action ln at current state sn• uniformly at random with probability pexplr;• otherwise, with probability {πIDSn (sn, ln)}.

3. Learning: after receiving the reward {rIDSn } and observ-ing the system state transition from sn to sn+1, each IDS• update QIDS and Qγ using (5) and (6), respectively.• update V IDS , πIDS , V γ and πγ using (7), (9), and (8),

(10) respectively.4. Run Algorithm 2.5. Repeat.

Algorithm 2 Resource Allocation Algorithm1. Each IDS senses the current system state Sn ={s1n, s

2n, ..., s

Nn , U1(s1

n), U2(s2n), ...UN (sNn )} and obtains the

strategies of both its attacker and itself.2. Each IDS reports to the IDN manager the amount ofresource it has and the expected reward function (11).3. After receiving all the reports from IDSs, the IDNmanager computes the optimal resource allocation using(12).4. The manager computes the charges for all the IDSs using(13).

IV. PERFORMANCE ANALYSIS

In this section, some analytical results of the proposed al-gorithm are presented. Throughout the entire analysis, thefollowing assumptions are made to facilitate the discussion,without loss of generality. We mainly focus on the case thatonly those healthy IDSs are allowed to share their redundantresource with others (either healthy or ill). To simplify thepresentation, it is assumed that the detection probability of li-brary lm against attack ak is q1 when m = k and q2 otherwise(i.e., plm,ak = q1 if m = k, and plm,ak = q2 otherwise), withq1 > q2 > 0. It is further assumed that the first m IDSs (i.e.,IDS1, IDS2, ..., IDSm) are in the healthy state and can loadU1 = U2 = · · · = Um = UH libraries at a time, while theother IDSs (i.e., IDSm+1, IDSm+2, ..., IDSN ) are in the illstate and can only load Um+1 = Um+2 = · · · = UN = UI

libraries simultaneously. Following the same procedure below,the above assumptions may be relaxed and similar results canbe obtained for more general cases.

In the following discussion, let πγij = πγi(si, aj) denote theestimated probability of attacker γi launching attack aj , andwIDSij = wIDSi

si,ajdenote the importance of detecting attack aj

for IDSi. For ease of presentation, define eij = πγij wIDSij

(here si is omitted in the notation because in the resourceallocation process, the states of all the subsystems remainunchanged), which admits

L∑j=1

πγij wIDSij =

L∑j=1

eij = ki, (14)

where ki is a constant if wIDSij and πγij are known and fixed

for 1 ≤ j ≤ L. Without loss of generality, the followingordering ei1 > ei2 > ... > eiL is assumed, ∀i ∈ N (due toassumed homogeneity). Then, it is not difficult to realize thatIDSi is expected to achieve the highest reward if it loads thefirst Ui libraries, i.e., l1, l2, ..., lUi

. As a result, the expectedreward of IDSi before resource allocation can be express as

E{RIDSi |Ui} =Ui∑j=1

eij [1− (1− q1)(1− q2)Ui−1

]

+L∑

j=Ui+1

eij [1− (1− q2)Ui ].

(15)

Similarly, the expected reward of IDSi after resource alloca-tion can be expressed as

E{RIDSi |U ′

i} =U

′i∑

j=1

eij [1− (1− q1)(1− q2)U

′i−1

]

+L∑

j=U′i +1

eij [1− (1− q2)U

′i ],

(16)

in which U′

i = Ui +∑j gji −

∑j gij denotes the number of

libraries IDSi can load after resource allocation.Proposition 1: Let

I = max1≤t≤n

{q2(1− q2)kt−

(q1 − q2)[Ut∑j=1

etj − (1− q2)Ut+1∑j=1

etj ]}(1− q2)Ut−1,(17)

D = min1≤t≤m

{q2kt−

(q1 − q2)[(1− q2)−1Ut−1∑j=1

etj −Ut∑j=1

etj ]}(1− q2)Ut−1.

(18)Then the resource allocation scheme will lead to improve-

ment in terms of expected reward if and only if the followingcondition holds:

I > D. (19)

Sketch of Proof: For the ease of presentation, define

XUii ,

Ui∑j=1

eij . Let ∆j =∑i gij −

∑i gji denote the amount

of resource that IDSj receives from (or gives to) other IDSs,which satisfies

∑nj=1 ∆j = 0. Further define

fj(∆j) = [(1− q2)kj − (q1 − q2)XUj

j ]

−[(1− q2)kj − (q1 − q2)XUj+∆j

j ](1− q2)∆j ,(20)

Then according to (15) and (16), the difference between theIDS’s rewards before and after resource allocation is given by

n∑i=1

E{RIDSi |U ′

i}−n∑i=1

E{RIDSi |Ui}

=n∑j=1

fj(∆j)(1− q2)Uj−1.(21)

When ∆j > 0, it is easy to see that XUj+∆j

j > XUj

j and(1− q2)∆j < 1, thus fj(∆j) > 0. Similarly, when ∆j < 0,fj(∆j) < 0.

When there exists j such that ∆j 6= 0, some terms in (21)will be positive and the others will be non-positive. Thus,(21) > 0 (i.e., there is performance improvement) if and onlyif the sum of the absolute values of the positive terms arelarger than that of the negative terms. Further checking theproperty of function fj , it can be verified that

fj(l + 2)− fj(l + 1) < fj(l + 1)− fj(l), (22)

which indicates that as l increases, the increasing rate of thefunction fj is decreasing. Thus, the necessary and sufficientcondition for (21) > 0 can be transformed into

max1≤j≤n

|fj(∆j = 1)(1− q2)Uj−1| >

min1≤j≤m

|fj(∆j = −1)(1− q2)Uj−1|, (23)

which is equivalent to (19).Remark 1: In (17) and (18), I represents the maximum

performance gain of IDN if one of the IDSs receives one unitof resource (one library), while D represents the minimumperformance loss of IDN if one of the IDSs gives out oneunit of resource. Assume that in a resource allocation schemewhich leads to performance improvement, some IDSs give outr units of resource in total, while some other IDSs receivethe r units of resource. The concavity of the functions fj(according to (22)) indicates that the total performance lossof the IDSs that give out resource is at least rD, whilethe total performance gain of the other IDSs is at most rI .The necessary condition for performance improvement is thusrI > rD, which explains the necessity of (19). On the otherhand, when I > D, it means that performance gain can beobtained by simply exchanging one unit of resource, whichverifies the sufficiency of (19).

The following corollary considers a special and yet impor-tant case.

Corollary 1: When q2 = 0, i.e., mismatched libraries haveno chance to detect attacks successfully, the resource allocationscheme will lead to improvement in terms of expected rewardif and only if the following condition holds:

max1≤t≤n

{etUt+1} > min1≤t≤m

{etUt}. (24)

Remark 2: Note that Ut represents the number of librariesthat IDSt could load before resource allocation. Therefore,the extra expected reward that IDSt could gain if one morelibrary can be loaded is given by

∑Ut+1j=1 q1e

tj−∑Utj=1 q1e

tj =

q1etUt+1, according to (15) and (16). Similarly, q1e

tUt

rep-resents the loss of expected reward if IDSt gives out oneunit of resource. When the reward gain for a certain IDS islarger than the loss of another, there will be a performanceimprovement in terms of expected reward by exchanging oneunit of resource. In practice, this condition is fairly easy tosatisfy, especially when some IDSs are facing severe attacks(i.e., the probabilities of attacks or the importance of detectingsuch attacks is high, or equivalently etUt+1 is large) while someother IDSs are facing some mild attacks (i.e., a relatively smalletUt

).

V. SIMULATION RESULTS

This section presents the simulation results to evaluate theeffectiveness of the proposed algorithm in different scenarios.Specifically, in all the scenarios, except where noted, it isassumed that there are three IDSs, with three corresponding at-tackers. It is further assumed that the detection libraries avail-able to all the IDSs are L = {l1, l2, ..., l8}, and the attackinglibraries available to all the attackers are A = {a1, a2, ..., a8}.The detection probability pli,ak of library li against attack akis 0.85 (q1) when i = k and 0.1 (q2) otherwise. It is furtherassumed that an IDS can load 1 and 5 libraries when it isin the ill and the healthy states, respectively, i.e., U(I) = 1,U(H) = 5. The discounting factor β, exploration probabilitypexplr and the learning rate for the Nash-Q learning algorithmsof both IDSs and attackers are chosen according to [5, 10]. Theaverage accumulated reward rIDSn defined below is consideredas the performance metric of interest:

rIDSn =1

n

n∑i=1

rIDSi . (25)

A. Improvement in different scenarios

First, the performance of the proposed algorithm in terms ofthe overall average accumulated reward of the three IDSsin three exemplary cases (labeled as “115”, “135”, “155”,respectively, which represent the number of attacks the threeattackers can launch simultaneously) are examined and com-pared. The importance factors of detecting different attacks forIDS1 and IDS3 are set as wIDS1 = [ 1 1 2 3 1 1 3 15

4 2 1 2 3 1 1 9 ]and wIDS3 = [ 11 12 11 15 17 2 1 1

12 14 23 24 17 1 3 2 ], with the number atthe i-th row and the j-th column representing the importanceof detecting attack aj at state i (i=1 for healthy and i=2 forill). Similarly, the profit factors for attackers are set as wγ1 =[ 1 1 1 3 1 1 2 103 2 1 4 3 1 8 13 ] and wγ3 = [ 11 13 12 12 15 5 3 1

12 17 13 14 16 2 4 1 ].Besides, for IDS1, the corresponding action-dependent statetransition matrices are set as pst(·|·) = [ 1 0

0.5 0.5 ] if thecorresponding attack is detected and pst(·|·) = [ 0.5 0.5

0.2 0.8 ]otherwise, with the number at the i-th row and the j-th columnrepresenting the probability of transiting from state i to state j.For IDS3, the corresponding action-dependent state transition

Time/t ×104

0 1 2 3 4 5 6 7

Avg

. acc

umul

ated

rew

ard

0

2

4

6

8

10

12

14

16

115-With resource allocation115-Without resource allocation135-With resource allocation135-Without resource allocation155-With resource allocation155-Without resource allocation

Fig. 3. The overall average accumulated reward of IDSs.

matrices are set as pst(·|·) = [ 1 00.5 0.5 ] if all the 5 attacks

are detected, pst(·|·) = [ 0.5 0.50.2 0.8 ] if 4 attacks are detected,

pst(·|·) = [ 0.2 0.80.1 0.9 ] if 3 attacks are detected, pst(·|·) =

[ 0.1 0.90.05 0.95 ] if 2 attacks are detected, pst(·|·) = [ 0.05 0.95

0.01 0.99 ]if only 1 attack is detected, and pst(·|·) = [ 0.01 0.99

0 1 ] ifno attack is detected. In the “115” case, the parameters ofIDS2 are set to be the same as IDS1; in the “135” case,the importance factors of detecting different attacks for IDS2

are set as wIDS2 = [ 1 2 1 1 11 12 15 11 1 2 2 15 14 18 2 ]; the profit factors

for the attacker are set as wγ2 = [ 1 3 2 2 15 13 11 11 1 3 1 11 9 9 1 ]; the

corresponding action-dependent state transition matrices areset as pst(·|·) = [ 1 0

0.5 0.5 ] if all the 3 attacks are detected,pst(·|·) = [ 0.5 0.5

0.2 0.8 ] if 2 attacks are detected, pst(·|·) =[ 0.2 0.80.1 0.9 ] if only 1 attack is detected, and pst(·|·) = [ 0.1 0.9

0 1 ]if no attack is detected. In the “155” case, the importancefactors of detecting different attacks for IDS2 are set aswIDS2 = [ 1 2 10 9 11 14 17 1

2 4 15 11 14 17 19 2 ]; the profit factors for theattacker are set as wγ2 = [ 1 1 12 12 15 15 13 1

2 1 13 14 16 2 4 1 ], while thestate transition matrices are set the same as IDS3. The overallaverage accumulated reward of using the proposed algorithmis compared with the original Nash-Q Learning Algorithmwithout the resource allocation step. It can be observed fromFig. 3 that the performance of the proposed algorithm (denotedby “with resource allocation”) significantly outperforms thecounterpart “without resource allocation” in all three cases,with performance improvement ranging from around 30% (forthe “155” case) to 80% (for the “115” case). Similar trendsare observed when different sets of system parameters areused (the results are omitted in the interest of space). Asexpected, the more spare resource the IDSs have, the larger thecollaboration gain is. As can be seen from Fig. 4, in the “115”case, on average IDS1 and IDS2 together share 4 librarieswith IDS3, who faces the most severe attack. While IDS1 andIDS2 sacrifice some performance, the overall performance issignificantly improved due to the dramatic improvement at theIDN bottleneck, IDS3 in this case.

B. Improvement against aggressiveness of the attacker

In this section, we examine the impact of the attacker’saggressiveness on the proposed algorithm. It is assumed thatIDS1 and IDS2 face attackers who can only launch one

IDS index1 2 3

Ave

rage

num

ber

of li

brar

ies

load

ed

0

1

2

3

4

5

6

With resource allocationWithout resource allocation

Fig. 4. Average number of libraries loaded by each IDS in the “115” case.TABLE II

THE IMPROVEMENT IN REWARD AGAINST AGGRESSIVENESS OFATTACKERS

Number of attackson IDS3

3 4 5 6

Improvement 31% 56% 85% 120%

attack, but IDS3 faces a more aggressive attacker who canlaunch multiple attacks. It is shown in Table II that whenIDS3 faces a more aggressive attacker, the collaboration gainin terms of improvement in the overall average accumulatedreward becomes more significant. Intuitively, when one of theIDSs faces more attacks, it is more important for this IDS toacquire help from others.C. Improvement with respect to detection probabilities

In this section, the impact of detection probabilities of librariesagainst attacks (i.e., q1 and q2) are examined.1 Again, assumethat IDS1 and IDS2 face attackers who can only launch oneattack at a time, and IDS3 faces an attacker who can launch5 attacks simultaneously. Fig. 5 shows that the performance ofthe proposed algorithm degrades when q2 increases. This maybe explained as follows: when q2 is larger, the probability ofdetecting attacks with mismatched libraries is increased, thusreducing the need for collaboration among IDSs. The impactof q1 seems more complicated as it is dependent on q2 anddeserves further exploration. Nonetheless, the proposed algo-rithm outperforms the original Nash-Q Learning Algorithm forall the q1 and q2 examined here.

VI. RELATED WORKS

In the past decades, various game theoretic approaches havebeen applied to facilitate predicting the behavior of intrudersand improve the detection performance of IDSs.

Early works in this direction often model the interactionbetween a single IDS and an intruder as a two-player game.These studies mainly focus on determining the optimal strate-gies of the IDS and predicting the possible actions of theintruder by analyzing the NE of the corresponding two-player

1Note that the learned strategies (i.e, πγ in (14)) depend on q1 and q2.The results in this section show the impact of q1 and q2 on the conditionsunder which the proposed algorithm leads to improvement.

q1

0.65 0.7 0.75 0.8 0.85 0.9 0.95

Impr

ovem

ent i

n re

war

d

20%

30%

40%

50%

60%

70%

80%

90%

q2=0.1

q2=0.2

q2=0.3

Fig. 5. The improvement in reward against q1 and q2 in the “115” case.

game. For example, in [11], a two-player, nonzero-sum, non-cooperative game is formulated to facilitate the IDS to form anoptimal defense strategy for the sensor network, based on thecorresponding NE. [12] studies a network environment wheremultiple intrusion detection techniques are deployed, and theNE was used to guide the IDS to choose among the differentintrusion detection techniques. In addition, considering that thetarget systems are often dynamic in practice, [13] formulatesa multi-stage dynamic intrusion detection game in which theIDS maintains and updates beliefs about the intruders by usingBayesian rules, and the resulting Perfect Bayesian equilibrium(PBE) specifies the best response strategies of the IDS ineach stage. Besides, SG has been employed in [4] to studythe IDS configuration problem. Our previous work [5] furtherconsiders incomplete information due to uncertainty in theintruder’s type.

In practice, multiple IDSs may be employed in a networkin which each IDS works independently but the performanceof each IDS is dependent on the well behaving of others. In[14], a multi-player stochastic nonzero-sum dynamic game isformulated to find the optimal defense strategy. Collabora-tive intrusion detection networks have also been consideredin literature (e.g., [6, 15, 16]). In an IDN, each IDS willshare information with others and benefit from the collectiveknowledge and experience shared by the peers. To overcomethe system vulnerability to malicious peers, some trust man-agement models and incentive designs have been proposedin [8, 17, 18]. However, none of these works considers thefollowing scenario: in an IDN, when the system hosting theIDS is in the ill state, the amount of resources available tothis IDS will be constrained so that it cannot run detectionalgorithms effectively to deal with the potential attacks. In thiswork, computational resource sharing in consultation basedIDNs is considered. Particularly, the IDSs in the network sendconsultation requests to others and “buy” resource when theypredict severe potential attacks. In addition, VCG auction isused to approach a socially optimal outcome.

VII. CONCLUSIONS AND FUTURE WORKS

In this work, the collaborative IDS configuration problem istackled through a two-layer SG approach, and a new algorithm

is proposed. Analytical and simulation results show that theproposed algorithm can provide an effective collaborative con-figuration scheme in many scenarios of interest. One limitationof this work is that it is assumed that all the IDSs will report toan IDN manager, which will solve a number of optimizationproblems in each round. In practice, this may impose a heavycommunication and computation burden to the system. Thus,generalizing the proposed method to a distributed mechanismremains an interesting future work. In addition, in this workit is assumed that all the IDSs in the IDN behave honestly.Designing a robust collaborative IDS configuration mechanismin the presence of malicious IDSs and free-riders constitutesanother interesting direction for future work.

REFERENCES

[1] T. F. Lunt and R. Jagannathan, “A prototype real-time intrusion-detectionexpert system,” in IEEE Symposium on Security and Privacy, Oakland,CA, Apr. 1988.

[2] M. H. Manshaei, Q. Zhu, T. Alpcan, T. Basar, and J.-P. Hubaux, “Gametheory meets network security and privacy,” ACM Comput. Surv., vol. 45,no. 3, pp. 25:1–25:39, 2013.

[3] W. He, C. Xia, H. Wang, C. Zhang, and Y. Ji, “A game theoreticalattack-defense model oriented to network security risk assessment,” inProc. of IEEE Computer Science and Software Engineering, Wuhan,Hubei, Dec. 2008.

[4] Q. Zhu and T. Basar, “Dynamic policy-based IDS configuration,” inProc. of IEEE Decision and Control, Shanghai, Dec. 2009.

[5] X. He, H. Dai, P. Ning, and R. Dutta, “Dynamic IDS configuration inthe presence of intruder type uncertainty,” in Proc. IEEE GLOBECOM,San Diego, CA, Dec. 2015.

[6] V. Yegneswaran, P. Barford, and S. Jha, “Global intrusion detection inthe DOMINO overlay system,” in ANDSSS, 2004.

[7] Q. Zhu, C. Fung, R. Boutaba, and T. Basar, “A game-theoretical ap-proach to incentive design in collaborative intrusion detection networks,”in Proc. International Symp. Game Theory Netw., Istanbul, May. 2009.

[8] C. J. Fung and R. Boutaba, “Design and management of collaborativeintrusion detection networks,” in IFIP/IEEE International Symposiumon Integrated Network Management, Ghent, May. 2013.

[9] A. Mas-Colell, M. D. Whinston, and J. R. Green, MicroeconomicTheory. Oxford University Press, 1995.

[10] J. Hu and M. P. Wellman, “Nash Q-learning for general-sum stochasticgames,” J. Mach. Learn. Res., vol. 4, pp. 1039–1069, 2003.

[11] A. Agah, S. K. Das, K. Basu, and M. Asadi, “Intrusion detection insensor networks: a non-cooperative game approach,” in Proc. IEEEInternational Symposium on Network Computing and Applications, Aug.2004.

[12] Y. Liu, H. Man, and C. Comaniciu, “A game theoretic approach toefficient mixed strategies for intrusion detection,” in Proc. IEEE ICC,Istanbul, June. 2006.

[13] S. Shen, “A game-theoretic approach for optimizing intrusion detectionstrategy in WSNs,” in Proc. IEEE AIMSEC, Deng Leng, Aug. 2011.

[14] Q. Zhu, H. Tembine, and T. Basar, “Network security configurations:A nonzero-sum stochastic game approach,” in Proc. of IEEE ACC,Baltimore, MD, June. 2010.

[15] C. V. Zhou, S. Karunasekera, and C. Leckie, “A peer-to-peer collabora-tive intrusion detection system,” in Proc. ICON’05, Nov. 2005.

[16] Y.-S. Wu, B. Foo, Y. Mei, and S. Bagchi, “Collaborative intrusiondetection system (CIDS): a framework for accurate and efficient IDS,”in Proc. Computer Security Applications Conference, Dec. 2003.

[17] C. J. Fung, J. Zhang, I. Aib, and R. Boutaba, “Robust and scalabletrust management for collaborative intrusion detection,” in IFIP/IEEEInternational Symposium on Integrated Network Management, LongIsland, NY, 2009.

[18] Q. Zhu, C. Fung, R. Boutaba, and T. Basar, “Guidex: A game-theoreticincentive-based mechanism for intrusion detection networks,” IEEEJournal on Selected Areas in Communications, vol. 30, no. 11, pp. 2220–2230, 2012.

Documents

Collaborative IDS Conﬁguration: A Two-layer Game-Theoretical Approachhdai/GC16-RJ.pdf · 2016-09-01 · Collaborative IDS Conﬁguration: A Two-layer ... select its detection libraries