SN08

Embed Size (px)

Citation preview

  • 8/2/2019 SN08

    1/7

    A Weak Process Approach to Anomaly Detection in

    Wireless Sensor Networks

    Marco PuglieseCenter of Excellence DEWS

    University of LAquila

    LAquila, Italy

    [email protected]

    Annarita GianiDep. of Electrical Engineering and Computer Sciences

    University of California at Berkeley

    Berkeley, CA, USA

    [email protected]

    Fortunato SantucciCenter of Excellence DEWS

    University of LAquila

    LAquila, Italy

    [email protected]

    AbstractThis paper introduces a novel methodology formodeling Anomaly Detection Logic (ADL) in intrusion detectionfor Wireless Sensor Networks (WSNs). In particular, we pro-pose an integrated approach to threat modeling and detection

    using the Weak Process Model (WPM) paradigm. WPMs arenonparametric versions of Hidden Markov Models (HMM) inwhich states transition probabilities are reduced to 0-1 rules ofreacheability.

    WSNs are resource-limited; WPMs are robust while consumingfew resources. Threat identification is performed by introducing athreat score, which weights the states sequence (hypothetic statestraces) associated with each threat model, while anomalies areclassified using predefined rules. The higher the threat score, thehigher the likelihood of an attack. Attacks have been classifiedaccording to their risk potentiality: Low Potential Attacks (LPA)and High Potential Attacks (HPA). An alarm is issued when atleast one HPA is likely to occur.

    We present a theoretical review of our methodology and reportexperimentally obtained quantitative data

    I. INTRODUCTION

    Sensor networks revolutionize the way we interact with the

    physical environment. A sensor network is often subject to

    a unique set of constraints, such as finite battery power and

    limited communication bandwidth. Furthermore, transferring

    data to a processing unit requires the existence of a secure

    communication channel. Wireless Sensor Network are flexible,

    can be quickly implemented and have low cost of opera-

    tion. If well designed and appropriately deployed, a WSN

    can become a secure communication platform for gathering

    sensory data. To be a "secure platform", it must implement

    specific functional modules, typically software applications,in a distributed computing architecture. Assuring security in

    a WSN depends on successful execution of several intercon-

    nected components: ciphering and authenticating data flows

    0This work has been partially supported by the EU FP6 NoE HYCON andis part of the project WINSOME at DEWS and by TRUST (Team for Researchin Ubiquitous Secure Technology), which receives support from the NationalScience Foundation (NSF award number CCF-0424422) and the followingorganizations: AFOSR (#FA9550-06-1-0244), BT, Cisco, ESCHER, HP, IBM,iCAST, Intel, Microsoft, ORNL, Pirelli, Qualcomm, Sun, Symantec, TelecomItalia, and United Technologies.

    (cryptography), detecting malicious intrusions coming from

    external or internal attackers, assuring correct functionality of

    the sensor nodes(integrity).

    An Intrusion Detection System (IDS) is a defense system,which detects hostile activities in a network. It recognizes

    patterns of known attacks (signature based), or identifies

    abnormal network activity that differ from historical norms

    (anomaly based) [1]. Model based reasoning combines models

    of misuse with reasoning to support conclusions about an

    attack. Recently, IDSs have been developed for the use on

    wireless networks. A Wireless IDS is similar to a standard,

    wired IDS, but contains additional deployment requirements

    as well as unique features specific to WLAN intrusion and

    misuse detection [2].

    This work considers an IDS based on Anomaly Detection

    Logic (ADL). The architecture taken as reference for this

    work is derived from, and compliant with a wide variety ofaccepted standards, and can be decomposed into the following

    components (see Fig. 1):

    Intrusion Alarm Generation (IAG): generates alarms

    based on ADL.

    Intrusion Reaction Logic (IRL): defines the defence strat-

    egy (determines the subset of nodes to be protected) and

    tracks correlated alarms.

    Intrusion Reaction logic Application (IRA): reacts to

    intrusion by deploying appropriate countermeasures (re-

    leasing links, putting compromised nodes in quarantine,

    distributing black lists / grey lists).

    The database Local Configuration Data (LCD) refers to theset of configuration data essential for any operation of the

    sensor node in the WSN: e.g. LCD contains data related to

    cryptographic key generation and to node authentication pro-

    cedures [3]. As it will be shown later, LCD is the fundamental

    component for determining if an anomaly is present on the

    incoming data traffic.

    The remainder of this paper is organized as follows. Sec-

    tion II presents a review of current intrusion detection tech-

    niques applied to WSN. Section III introduces threat modeling

  • 8/2/2019 SN08

    2/7

    2

    Figure 1. ADL-based IDS macro-functions.

    using Weak Process Model approach. Section IV describes

    a method for performing threat identification. Section V de-

    scribes the functional architecture for the Anomaly Detection

    Logic module of IDS, as well as the anomaly rules and the

    internal blocks of the threat model. In Section VI the proposed

    scheme is applied to a specific threat (HELLO flooding).

    Section VII contains concluding discussion and future work.

    I I . MOTIVATION

    A threat can be defined as a particular strategy in engaging

    an attack. Typical threats affecting WSNs are reported in [4].

    WSNs are resource limitated: memory, energy and accessi-

    bility constraints make the use of current security techniques

    challenging [5].

    Traditional intrusion detection techniques cannot be well

    implemented on a WSN, they are not effective, too limited

    to a restricted number of threats, or too restrictive: cross-

    correlating aggregated data (among all analyzing fluctuations

    in sensor readings [6], watching over the information in-

    terchange [7], modeling normal traffic to detect abnormaltraffic patterns [8]), or running the Baum-Viterbi algorithm

    as likelihood criterion in threats identification when HMM are

    used to model threats [9], are some examples.

    The issue of implementing an effective IDS on a WSN leads

    to the problem of finding a trade-off between the capability of

    identifying threats (i.e. with a bounded false alarm rate), the

    complexity of the algorithms and memory usage [10].

    We believe that an anomaly-based IDS using weak process

    models applied to threat modelling with a weight-based logic

    for threat identification can be a promising solution for a

    resource constraint platform such as WSNs. First, a WPM

    simplifies HMM (ref. to Sec. III for details) leading to a

    simplified threat identification likelihood logic in comparisonwith the Viterbi algorithm. This simplification is done at the

    expense of sacrifying the capability of the intrusion system to

    reveal attacks with low probabilities. However increasing the

    number of observables associated to a state (ref. to Sec. III) can

    lead to better accuracy with an expense in memory occupancy

    but without increasing algorithmic complexity. Nevertheless

    given a fixed amount of resource, a larger number of WPM-

    based threats can be considered compared to HMM-based

    models. Second, a careful project design (adopting specific

    application architecture) and software engineering techniques

    (modular algorithms and code squeezing) are adopted, as will

    be shown in Sec. V.

    III. WEA K PROCESS FORMALISM FOR THREAT MODELING

    According to the definition given in Sec. II, threats can

    be modeled as random dynamical systems using stochastic

    Finite State Machines (FSMs). Threat detection by monitoring

    anomalous events from environment becomes equivalent to

    FSM structure identification by observing symbols related to

    the FSM states. A Hidden Markov Model (HMM) is a doubly

    stochastic FSM with an underlying stochastic process that is

    hidden but indirectly observable through another stochastic

    process that produce a sequence of observable symbols (ob-

    servables) [11], [12]. In HMMs, state transitions follow a first-

    order Markov chain, and an observable is emitted according

    to some probability distribution each time a state is entered.

    Anomaly Intrusion Detection can be treated as a classifi-

    cation problem. The use of Hidden Markov Model for thedetection of malicious computer and system intrusion is not

    new. The sequence of system calls is given as input to a

    system model. If the probability of the models specifying

    normal activity is less then a specific threshold, an alarm

    is triggered [13], [14]. Another approach that uses HMMs

    consists of modeling computer attacks and using system

    alerts as observables to find the most likely process [15].

    Nonparametric versions of HMMs are called weak models.

    Weak models are robust for process detection and easy to

    construct, as the assumption of knowing precise probabilities

    in HMMs is weakened to 0,1-values of reachabilities [16].

    A Weak Process Model [10] is an HMM where only the

    reachability between states and observables are specified (asopposed to transition and emission probabilities). By mapping

    these concepts to threat models, it follows that the existence

    of an observable can be considered as an indicator that an

    attack against the WSN may be occurring. The information

    about the current state of the attack is not available (it is an

    hidden state) but we observe a sub-set of choices. Fig. 2 shows

    a simple example of WPM where the green bordered node in

    the graph represents the initial state and the red one the final

    state. There are 6 observables and 5 states, and the mapping

    is: when event 1 is observed (or observable 1 occurs) then

    the FSM can be in state 1 or 5; when event 3 is observed

    (or observable 3 occurs) then the FSM can be in state 2, 4

    or 5. This mapping is represented as a graph by including theobservables into brackets for each state.

    A WPM, as any Markov model, can be formally represented

    using the canonical form:xk+1 = Axk

    ok = Bxk(1)

    where:

    States set X is defined by X = (x1, x2, . . . , x3) ;

  • 8/2/2019 SN08

    3/7

    3

    Individual state x at step k is defined by xk and x0 (k =0) is the initial state;

    Observable set O is defined by O = (o1, o2, . . . , oq) ; Individual observable o at step k is defined by ok;

    State Transition Distribution A defines the n n matrixrepresenting the behaviour of each threat. Matrix ele-

    ments are defined as:

    Ai,j =

    1, if p(xk+1

    = xj |xk = xi) = 10, otherwise

    Emission Distribution B defines the q n matrix rep-resenting the mapping between each observable and the

    sub-set of the possible states. Matrix elements are defined

    as

    Bi,j =

    1, if p(ok = oj |x

    k = xi) = 1

    0, otherwise

    Figure 2. Example of WPM with 6 observables and 5 states.

    IV. THREAT IDENTIFICATION AND ALARM GENERATION

    PROBLEMS

    Supposing the WSN is currently under attack. What is the

    criterion to identify the most likely threat currently occurring?

    This is the Threat Identification Problem. Another question is:

    supposing that threat has been identified, what is the criterion

    to state the hazard level of the attack and, according to

    it, the choice to issue or not an alarm? This is the Alarm

    Generation Problem, which impacts the reliability of the

    proposed mechanism (false alarms percentage).

    Our approach to answer these questions is to assign weights(or scores) to the output produced by each threat model when

    an observable has been applied as input: this output is the

    Hypothetic States Trace, as will be defined in the following.

    Definition (Hypothetic Engaged States at step k, Hpxk ).

    The Hypothetic engaged states at step k is the sub-set of pos-

    sible (hidden) states associated to the observable ok. Formally

    is Hpxk = BTok.

    Definition (Hypothetic Free States at step k, Freexk ). The

    Hypothetic Free States at step k is the sub-set of states not

    yet engaged at observable ok. Formally: Freexk = xk

    (xk Hpxk) where the symbol indicates the Boolean ANDbetween vector elements.

    Definition (Hypothetic States Trace at step k, Trk ). The

    Hypothetic States Trace at step k is the sequence of engaged

    states compliant to a state sequence in the FSM graph up to

    observable ok. Formally: Trk = k1

    i=1 (Hpkx AFreexi).

    Thus an Hypothetic States Trace is the output from a threatmodel when an observable has been applied as input.

    Weights are assigned to each trace by applying the n nscore matrix S, whose elements are defined:

    sij is the score assigned to transition from xj to xi sjj is the score assigned to state xj (e.g. the initial state

    x0 ).

    The following formula returns the score associated to the

    threat after the first k observable steps:

    sk =k

    j=1

    (Hpxj x0)TSx0 +

    j

    i=1

    T riT

    S Freexi (2)

    The idea behind this expression is to weight WPM transi-

    tions according to the topological positions of the starting and

    ending states involved in the transition. As will be shown later,

    this allows to straightforwardly define two different hazard

    levels of an attack (5). The threat score is given by the total

    sum for the sequence of hidden states (trace) up to the current

    observable: this definition gives us an expressive indicator of

    both threat identity (non-zero alarms for a certain threat model)

    and its related hazard level (the score value). It can be shown

    that the algorithmic complexity of equation (2) is O(k2) untilthe memory order of the underlying Markov model associated

    to the WPM (which coincides with the Local Buffer Memory,LBM, depicted in Fig.6, where the hypothetic free states are

    buffered, see Fig.7) has been filled up, i.e. for k < LBM;

    definitively, i.e. for k LBM , the complexity becomes O(k)because as a new observable enters in the model, the oldest

    hypothetic free states are discarded from memory and score

    is updated by sweeping just once the remaining free states. In

    the case k < LBM we have:

    sk =k

    j=1

    (Hpxj x0)TSx0 +

    ji=1

    T riT

    S Freexi (3)

    and in the case k

    LBM we have:

    sk = (Hpxk x0)TSx0 +

    LBMi=1

    T riT

    S Freexi (4)

    If the resulting score is decomposed in the following com-

    ponents:

    s = shpa + slpa (5)

    then shpa and slpa indicate two hazard levels:

  • 8/2/2019 SN08

    4/7

    4

    Definition (Low Potential Attack, LPA, slpa ). An attack

    can be considered "low potentially dangerous" if an hypothetic

    engaged state is at least 2 hops before the final state in the

    FSM graph representing the threat.

    Definition (High Potential Attack, HPA, shpa ). An attack

    can be considered "high potentially dangerous" if a hypothetic

    engaged state is a penultimate state in the FSM graph repre-

    senting the threat.Assuming that the buffer message is less than 100 positions,

    then the following score assignments can be adopted:

    The total score at each state at least 2 hop before the final

    state is 1 or multiples (thus 0 slpa 99 ); The total score at each penultimate state is 100 or

    multiples (thus 0 shpa 9900 ); If there are observables associated to the final state then

    the score -100 is assigned to each transition from any

    penultimate state to the final state.

    Alarms and the related scores are issued (Al[s]) when apenultimate state has been reached and an HPA is likely to

    occurre.

    The example reported in Fig. 3 provides a better understand-

    ing where, for simplicity, matrices A and B have not been

    represented. Given the WPM-based threat model depicted in

    Fig. 2, if the sequence of observables {3,1,4,2,5,6} is applied

    as input to the model, the Hypothetic State Traces 1-2-4-5 and

    1-3-5 are produced (see Fig. 3). Hypothetic states not in any

    trace have been barred. The application of Eq. (2) to traces

    generates the score values associated to each trace at every

    step k, with k = 1, 2, .., 6. If state 3 or state 4 is reached(respectively at step 2 and step 4) an HPAs, by definition, is

    likely occurring and alarms are issued with scores 101 and

    200 respectively. According to Eq. (2), the alarm semantic is:

    1 HPA +1 LPA at step 2, 2 HPA + 0 LPA at step 4.

    Figure 3. An example of Alarm Generation.

    Now suppose to feed the threats TM1 and TM2 represented

    in Fig. 4 with the following sequence of observables:

    {6, 4, 5, 6, 4, 5, 6, 3, 3, 2, 6, 4, 5, 3, 2, 4, 3, 6, 4, 5, 1, 1, 6, 4, 5, 2, 2, 6, 4, 5}(6)

    Simulations using MATLAB software tool provided the

    result reported graphically in Fig. 5 where horizontal axis is

    Figure 4. Threats TM1 and TM2 respectively.

    labeled with observation steps (k) and vertical axis with scores

    (s); orange bars refer to TM1 and blue bars to TM2.

    As already mentioned, indicators for threats identification

    and the related hazard level are given by the existence of

    a score associated to a threat model (score not zero) and

    the related values respectively: e.g. observation steps rangingfrom 14 to 18 show a majority of not-zero scores from TM1

    (with high hazard levels) identifying TM1 as the most likely

    attacking the network, while steps ranging from 24 to 30

    identify TM2.

    Figure 5. Alarms related to TM1 and TM2 when (6) is applied as input.

    V. INTRUSION ALARM GENERATION

    Fig. 6 represents the functional architecture of the Intrusion

    Alarm Generation module: internally the main blocks areAnomaly Detection Logic (ADL) and several of Threat Models

    (TM).

    ADL block implements the following functions:

    1) ADL block applies a predefined set of rules based on

    the Local Configuration Data, to determine if an incom-

    ing signalling message contains anomalies or not. The

    application of these anomaly rules give two possible

    results: "no anomalies", resulting in the message being

    processed further, or "anomaly type T", according to the

  • 8/2/2019 SN08

    5/7

  • 8/2/2019 SN08

    6/7

    6

    Step 3. From the previous results, the WPM-based HELLO

    Flooding FSM is represented in Fig. 9. The number of states

    is 4 and the number of observables is 4. the canonical form

    Eq. (1) can be specialized using matrices A and B in (7) and

    the score matrix S in (8).

    The threat behavior represented in Fig. 9 is easily described.

    If the observable 1 or 2 or 3 occurs then the threat is

    considered "started" and state HF1 defines an LPA. If nomore observables are identified in the following K steps (with

    K predefined threshold) then the threat is considered "reset",

    which means either "attack temporary suspended" or maybe

    there were no attack at all (HF3). If the observable 1 or 2

    or 3 occurs again then the attack is considered dangerous,

    state HF2 defines an HPA and an alarm is issued; if no more

    observables are identified in the following K steps then the

    attack is reset. The final state, SUCCESSFULLY ATTACK,

    is formally never reached so that alarm remains set until a

    suited countermeasure has been taken (in this case alarm is

    not considered anymore) or the threat returns reset.

    Figure 9. WPM-based HELLO Flooding Model

    A =

    0 0 0 01 0 0 01 1 0 00 1 0 0

    B =

    1 1 0 01 1 0 01 1 0 00 0 1 0

    (7)

    S =

    1 0 0 099 0 0 01 100 0 00 0 0 0

    (8)

    Now suppose to feed the HELLO Flooding threat model

    graphically represented in Fig. 9 with the following sequenceof observables in the hypothesis that the threshold K intro-

    duced in Table II is equal to 3.

    {2;2; ; ; ;3;1;2;3; ; ; ; 3; ;3;2;; } (9)

    where the symbol indicates any other observable differentfrom those listed in Table II. Simulations using MATLAB

    provided the results reported graphically in Fig. 10 and results

    confirm expectations: observable sequences 3-4-5 and 10-11-

    12 indicate an attack interruption and alarms have been reset;

    moreover the maximum value for alarm 301 (i.e. 3 HPA + 1

    LPA) is reached at the last observable in the sequence 6-7-8-9.

    Figure 10. Alarms related to Hello Flooding threat when (9) is applied asinput

    We are extending this approach to Sinkhole and Wormhole.

    In a Sinkhole the attacker emulates the sink and lures trafficfrom other nodes. In a Sinkhole an adversary receives packets

    at one location in the network and tunnels them to another

    location (low latency link) in the network, where the packets

    are resent into the network.

    VII. CONCLUSION AND FUTURE WOR K

    In this paper we propose a new approach to anomaly

    detection and alarm generation logic using Weak Process

    Model formalism, which is a simplified version of Hidden

    Markov Models. Weak models are both robust tools for process

    detection, and are easy to construct.

    The proposed anomaly detection logic for threat modeling,threat identification, and alarm generation has been validated

    using MATLAB simulations. Other threats are currently being

    modeled using the proposed formalism. The objective is the

    experimentation on a MicaZ cluster-based sensor network with

    a stepped implementation and deployment approach starting

    from few sensor nodes. Currently we are carrying on early

    experimentations using [17] on few MicaZ sensor nodes and

    have implemented a stub module into the middleware code for

    the application of the anomaly rules on incoming signalling

    messages [18]. Moreover we are packing ADL code for WSN

    development environment. A well-suited application scenario

    for WINSOME (WIreless sensor Network-based Secure sys-

    tem for structural integrity Monitoring and AlErting) project isbiomedicine, where WSN are deployed as body area networks

    and a sink node is integrated in a PDA. In such case,

    the security and reliability are absolute qualifying indicators

    for the effectiveness of the monitoring system. Cross-layer,

    adopted in WINSOME, is a key design approach for security

    management. In particular, a deep integration with the cryp-

    tographic scheme (specifically the authentication procedure)

    has been applied as input (compare Local Configuration Data

    defined in Section I with [3]).

  • 8/2/2019 SN08

    7/7