05645253

  • 8/3/2019 05645253

    On the use of Innate and Adaptive parts of Artificial Immune Systems for Online Fraud Detection

    R Tw AK N

    Department of Computer Science, Liverpool Hope University

    {huangr, tawkh, nagara}@hope.ac.uk

    Abstract

    This paper describes a hybrid model for online fraud detection of the Video-on-Demand System as an E-commerce application, which combines algorithms from the two main distinct viewpoints of the self non-self theory and danger theory. Our artificial immune based algorithm includes the improved version of negative selection called Conserved Self Pattern Recognition Algorithm (CSPRA) and a recently established algorithm inspired by Danger Theory (DT) called Dendritic Cells Algorithm (DCA). The experimental results based on our Video-on-Demand case study demonstrate that the hybrid approach has a higher detection rate and a lower false alarm rate when compared with the results achieved by only using CSPRA or DCA as individual algorithms.

    1. Introduction

    Artificial immune systems are one of the most rapidly emerging biologically motivated computing paradigms. In the last 15 years, researchers have used classical immunology concepts to develop a set of algorithms, which include negative selection [1], immune networks [2] and clonal selection [3] from the adaptive part of the immune system, and the dendritic cells algorithm [4] from the innate part of the immune system. Since Artificial Immune Systems (AIS) are still relatively young and the natural immune system (NIS) is one of the most complex systems under active study by biologists, there are some distinct viewpoints about the main goal of the NIS. These ideas and understandings are extremely important for AIS researchers and designers.

    The two main distinct viewpoints are the self-nonself theory and danger theory. Classical immunology stipulates that an immune response is triggered when the body encounters something nonself or foreign [5]. This viewpoint is generally accepted by immunologists, and models have been created by AIS researchers based on this approach. Many questions arise from this viewpoint, and a new theory called Danger Theory has been developed. The main idea behind danger theory is that the immune system does not respond to nonself but to danger. Like the self-nonself theories, it fundamentally supports the need for discrimination. However, it differs in the answer to what should be responded to. Instead of responding to foreignness, the immune system reacts to danger [6].

    The Danger Theory does not deny the existence of a self-nonself model. Instead it provides a way of escaping the semantic difficulties with self and nonself, providing a grounding method for the immune response. There is a subclass of AIS algorithms called hybrid AIS algorithms which integrate the innate and adaptive parts of NIS mechanisms, but there are very few implementations of such algorithms.

    Current fraud detection algorithms include auditing [7], expert systems [8], fuzzy logic [9], neural networks [10], pattern recognition [11], statistics [12], decision trees [13], and regression [14]. These algorithms only focus on highly specific types of fraud detection and do not attempt to implement an extensible approach with the capability of preventing different kinds of fraud. Hence there is a need for the development of more efficient techniques for fraud detection. This paper aims to introduce a hybrid artificial immune inspired approach to detect online fraud in the Video-on-Demand system. The approach will combine the Dendritic Cell Algorithm (DCA), an algorithm inspired by DT, and the Conserved Self Pattern Recognition Algorithm (CSPRA) from self-nonself theory with the help of the behavioural engine based on Classification and Regression Trees (CART). By combining these two concepts, their strengths can be

    Since the user's behaviour varies with time, an exponential trace memory that maintains a moving average of past input is obtained using equation (1).

    The moving average is calculated for all parameters over a configurable time interval grouping events together. The configurable J allows for the representation of averages spanning several time intervals. The higher the value of J, the lower the influence of input from previous intervals on the current input patterns. A value of 0.7 was used in the present detection tests, which proved to provide a sufficient decay rate for the intended purpose of the proposed model.

    The sum of all input events (over an interval of 1440 minutes, i.e. 24 hours) was fed to the inputs of the system. This allowed the proposed detection scheme to detect fraud with a granularity of 24 hours, which should be sufficient for most service environments using fraud detection. The selected events for detection techniques are shown in Table 1.
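The effect of the decay parameter can be illustrated with a small sketch. Equation (1) is not reproduced in this section, so the recurrence below is an assumption chosen only to match the stated behaviour (a higher J gives previous intervals less influence); the function name `exponential_trace` and the sample data are hypothetical.

```python
def exponential_trace(values, decay=0.7):
    """Exponentially weighted moving average of a sequence of interval sums.

    Assumed recurrence (not taken from the paper's Equation (1)):
        avg_t = decay * x_t + (1 - decay) * avg_{t-1}
    so a higher decay lowers the influence of earlier intervals.
    """
    averages = []
    previous = None
    for x in values:
        # The first interval has no history, so it seeds the trace.
        current = x if previous is None else decay * x + (1.0 - decay) * previous
        averages.append(current)
        previous = current
    return averages


# Hypothetical daily sums of failed login attempts for one user.
daily_failed_logins = [2, 0, 1, 9]
trace = exponential_trace(daily_failed_logins, decay=0.7)
print(trace)
```

With a decay of 0.7 the trace follows the sudden jump on the last day closely, while still carrying a little of the earlier, quieter history.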

    Table 1. Selected Input Attributes

    Input   Attributes
    1       Sum of failed login attempts
    2       Sum of successful login attempts
    3       Sum of failed movie orders
    4       Sum of successful movie orders
    5       Sum of delivery notifications
    6       Sum of billing notifications
    7       Ratio of upload & downloaded

    The simulation resulted in synthetic data for 7 months containing six hundred normal users. Break-in fraud cases, covering 100 break-in fraudsters, were injected into the system.

    3.3 Dendritic Cell Algorithm

    The idea of DCA is to correlate disparate data streams in the form of antigens and signals, and to label groups of identical antigens as normal or anomalous, with no training required [4]. The selected attributes will be categorized into three different signals: Pathogen Associated Molecular Patterns (PAMP), Danger, and Safe signals. The DCA will be able to classify the current data into Low, Medium, and High levels of danger.

    For these three signals, a separate function is performed on the inputs to produce each of the appropriate output cytokines: costimulatory molecules (CSM), mature (M) and semi-mature (SM). Equation (2), borrowed from [4], determines a final decision by assigning a different weight to each signal.

    $$ O_{[c]} = \frac{(W_P \cdot C_P) + (W_S \cdot C_S) + (W_D \cdot C_D)}{|W_P| + |W_S| + |W_D|} \qquad (2) $$

    Where W_P, W_S, W_D and C_P, C_S, C_D are the weights and concentrations of the PAMP, Safe and Danger signals respectively, and c denotes the output being calculated (CSM, SM or M).
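As a minimal sketch of the signal processing step, the snippet below reads Equation (2) as a weighted sum of the three signal concentrations divided by the sum of the absolute weights (the absolute values in the denominator are an assumption about the original layout). The weight values and signal concentrations are placeholders for illustration only and are not taken from Table 4.

```python
SIGNALS = ("pamp", "safe", "danger")

# Placeholder weights per output cytokine; illustrative only, not Table 4.
ILLUSTRATIVE_WEIGHTS = {
    "csm": {"pamp": 2.0, "safe": 2.0, "danger": 1.0},
    "semi_mature": {"pamp": 0.0, "safe": 3.0, "danger": 0.0},
    "mature": {"pamp": 2.0, "safe": -3.0, "danger": 1.0},
}


def output_concentration(weights, concentrations):
    """Weighted sum of signal concentrations, normalised by |weights|."""
    numerator = sum(weights[s] * concentrations[s] for s in SIGNALS)
    denominator = sum(abs(weights[s]) for s in SIGNALS)
    return numerator / denominator


# Hypothetical normalised signal values for one sampled antigen.
sample = {"pamp": 0.2, "safe": 0.05, "danger": 0.5}
outputs = {
    name: output_concentration(w, sample)
    for name, w in ILLUSTRATIVE_WEIGHTS.items()
}
# The cell context is "mature" (danger) when M exceeds SM, otherwise safe.
context = "mature" if outputs["mature"] > outputs["semi_mature"] else "semi_mature"
print(outputs, context)
```

A negative weight for the safe signal on the mature output is how a strong safe signal can suppress a danger context in this kind of scheme.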

    When the output of CSM exceeds its migration threshold, the immature DC will move to the maturation level. The overall context is termed safe if the output of SM is greater than M, and dangerous otherwise.

    Each antigen will be sampled multiple times in order to appear in different contexts. A mature context antigen value (MCAV) is calculated for each antigen by dividing the number of times that antigen has appeared in the danger context by the total number of its appearances. Finally, a threshold is applied to the MCAV to make the final decision. The pseudo code for DCA is depicted in Algorithm (1).

    1: Sampling Stage
    FOR Cycle = 1 : Max_DC_Cycle
      FOREACH DC in population
        Sample antigens from Pool & store them
        Calculate signals for each sampled antigen & store them
        Calculate output CSM, SM, M using Equation (2)
        Add CSM, SM, M to total CSM, SM, M respectively
        IF total CSM > Threshold
          IF total SM > M, add DC to SM population
          ELSE, add DC to M population
      ENDFOR
    ENDFOR
    2: Analysis Stage
    FOREACH antigen in SM & M population
      Calculate no. of times it appears in SM & M population
      IF M/(SM+M) > MCAV threshold
        Antigen is Fraud User
      ELSE Antigen is Normal User

    Algorithm 1. Pseudo code for DCA
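The analysis stage can be sketched as follows, using the definition given in the text: the MCAV is the number of appearances of an antigen in the danger (mature) context divided by its total number of appearances. The function names, the context labels and the threshold of 0.5 are hypothetical choices for this illustration.

```python
from collections import Counter


def mcav(contexts):
    """Fraction of an antigen's appearances that fell in the mature context."""
    counts = Counter(contexts)
    total = counts["mature"] + counts["semi_mature"]
    if total == 0:
        raise ValueError("antigen was never presented in any context")
    return counts["mature"] / total


def classify_user(contexts, threshold):
    """Label the antigen (user) as fraud when its MCAV exceeds the threshold."""
    return "fraud" if mcav(contexts) > threshold else "normal"


# Hypothetical contexts collected for two users over several DC cycles.
presentations = {
    "user_a": ["mature", "mature", "semi_mature", "mature"],
    "user_b": ["semi_mature", "semi_mature", "mature", "semi_mature"],
}
labels = {user: classify_user(ctx, threshold=0.5) for user, ctx in presentations.items()}
print(labels)
```

Because each antigen is sampled many times, a single misleading context has only a limited effect on the final label.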

    3.4 Conserved Self Pattern Recognition Algorithms

    CSPRA is inspired by the biological Pattern Recognition Receptors (PRRs) model published by Janeway [21]. PRRs can be viewed as an improved version of negative selection, suggesting that the Antigen Presenting Cells (APCs) can recognize evolving pathogens.

    CSPRA will select parts of the data from normal users to generate detectors in order to distinguish the current data as self or nonself data with the help of the APC detector.

    At the training stage, it learns about normal behaviour in the system. At the end of this stage, the system will be able to select the conserved pattern and generate an APC detector. The system also runs the negative selection process and creates its detectors.

    During the detection stage, the detectors generated by negative selection and the APC detector work together to check whether newly collected antigens represent the behaviour of good or bad users. The Euclidean distance rule is used for affinity measurement. Algorithm (2) depicts the pseudo code for CSPRA.

    1: Training Stage
    Generate T detectors using negative selection & APC detector
    T1: threshold for self data
    T2: threshold for T detectors
    T3: threshold for APC detector
    T4: threshold for suspicious antigen
    2: Detection Stage
    FOREACH Antigen
      COMPUTE dist(T) with T detectors, dist(S) with Self data
      IF dist(T) < T2 & dist(ST1
        IF dist(T) > T4
          COMPUTE dist(C) with APC detector
          IF dist(C) > T3, Antigen is nonself data
          ELSE Antigen is undetected data
        ELSE Antigen is nonself data
      IF dist(TT2 & dist(S)

    4.1 Response to Danger by DCA

    DCA did not require training, but has a large number of parameters and stochastic elements. The full training data set was used for setting the parameters. A summary of the experiments performed on each parameter is listed in Table 2. Unless stated, all other parameters are shown in Table 3. The weights for signal processing defined in Table 4 were derived from empirical biological data [4].

    Table 2. Experiment codes and settings

    Parameters    Parameter Values
    Number of Receptor
    Number of DCs
    Migration Range
    I; 2;;2;;; ; 7
    -; -; -; -;-
    Antigen Multiplier    ; ; 7; ; 2

    Table 3. Default parameter setting for DCA

    Parameters    Values
    Number of Antigen Receptor
    Number of DCs
    I
    -
    Migration Threshold Range
    Number of Antigen Multiplier

    Table 4. Default weights for function 2

    Weight  CSM  SM  M
    Wp      2 2
    WD      I
    Ws

    Table 5. Signal Mapping for DCA

    Signals and definition of the listed signals:

    PAMP: A PAMP signal is a strong indicator of a pathogenic presence. A strong indication of break-in could be the number of movies the user tries to order. Several factors could result in order failures, yet there is a higher probability that a fraudster will fail more often than an authentic user when trying to order movies.

    Danger: A Danger Signal may or may not indicate an anomaly. However the probability of an anomaly is higher than normal. The number of successful orders is considered as a danger signal. In normal circumstances, successful movie orders should not exceed a certain number; a break-in fraudster will try to order as many movies as possible successfully. Therefore the number of successful orders in a certain time period for a fraudster will be higher than that of a normal user.

    Safe: A Safe Signal almost certainly indicates that no anomalies are present. The first two are associated with login: the number of times the user fails to login and the number of successful logins. The third one is the number of billing notifications. The number of successful billing notifications will be less than a normal user in normal circumstances.

    Data will be normalized within a range of 0 to 1, based on maximum values derived in preliminary experiments, and the signal mapping is shown in Table 5.

    Experiment results on the number of cells showed that once the cell population drops below 300 or rises above 00, differences are shown in the detection rate and false alarm rate. In the present case, the number of DCs should be defined between 300 and 00.

    Each antigen is copied multiple times in the system. The classification decision is the average value over the replicated population. According to the experimental results, the antigen multiplier did not consistently enhance the performance of the DCA. In the present case study a multiplier between 0 and 90 gave better accuracy.

    Equation 3 gives the value for DC sampling over 2 iterations when the signal strengths are half of the expected total input signal maximum. This is assigned to the median value of the migration threshold range, which is 108 in this case. A migration threshold range slightly below the median value gave a better performance than one above it. The best migration threshold was identified to be between 0.4 and 10

    $$ m_{value} = 0.5 \times \big( (max_P \cdot W_P) + (max_D \cdot W_D) + (max_S \cdot W_S) \big) \qquad (3) $$
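For illustration, the median computation can be sketched by reading Equation (3) as half of the weighted sum of the maximum PAMP, Danger and Safe signal values. That reading of the garbled symbols is an assumption, and the maxima and weights below are placeholders, not the values used in the experiments.

```python
def median_migration_threshold(max_signals, weights):
    """Half of the weighted sum of the maximum signal values."""
    weighted_total = sum(max_signals[name] * weights[name] for name in max_signals)
    return 0.5 * weighted_total


# Placeholder inputs: signals normalised to [0, 1], illustrative CSM weights.
max_signals = {"pamp": 1.0, "danger": 1.0, "safe": 1.0}
csm_weights = {"pamp": 2.0, "danger": 1.0, "safe": 2.0}

median = median_migration_threshold(max_signals, csm_weights)
print(median)
```

A range of per-cell migration thresholds can then be centred on, or placed slightly below, this median value.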

    The number of DC antigen receptors is the number of antigens each DC can sample per iteration. The default value of 1 antigen receptor exhibited a better result, as an increase in the number of receptors decreased the detection rate.

    The best parameter settings are shown in Table 6. Using these, a threshold value (MCAV) of 068 for fraud users was established. The moving average of past input in equation (1) gave a similar effect to moving time windows for DCA.

    Table 6. Best parameters in DCA for break-in fraud

    Parameters                        Value
    Number of Signal categories       3
    Number of Cells                   300
    Number of DC antigen receptors
    Max Cycle                         100
    Antigen Multiplier                50
    Migration Threshold Range         0.5 1.2

    4.2 Response to Non-self by CSPRA

    A range of experiments on the CSPRA was designed carefully in order to evaluate the performance of the algorithm and analyze the experimental results. Firstly, the conserved self pattern needs to be selected correctly. The Pearson Product Moment Correlation Coefficient is the most widely used measurement of correlation [23]. The computational formula is shown below in equation (4):

    $$ r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{N\sum X^2 - (\sum X)^2}\,\sqrt{N\sum Y^2 - (\sum Y)^2}} \qquad (4) $$
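The computational form of the Pearson coefficient translates directly into code. The sketch below is a plain-Python illustration; the function name `pearson_r` and the sample columns are hypothetical, and in practice the inputs would be two attribute columns from the training data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product moment correlation using the computational formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return numerator / denominator


# Hypothetical per-day counts for two attributes.
failed_orders = [1, 2, 3, 4]
successful_orders = [2, 4, 6, 8]
r = pearson_r(failed_orders, successful_orders)
print(r)
```

Computing this coefficient for every pair of input attributes and keeping the most strongly correlated pair is one straightforward way to pick candidate conserved patterns.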

    Inputs 3 and 4 from Table 1 had the highest correlation, so they are selected as the conserved self patterns. An antigen which is detected as a suspicious one by negative selection will be costimulated by the APC detector. For the APC detector, the max, min, and mean of all the values in the training data need to be calculated. This can be represented as follows:

    {(3, 0, 18, 1.0104), (4, 0, 33, .3238)}

    The distance between a suspicious antigen p and the APC detector d is calculated using equation (5):

    $$ Dist(p, d) = \sum_{i=1}^{w} \frac{|p_i - d_i|}{n_i - m_i} \qquad (5) $$

    Where w is the number of dimensions for the conserved pattern; m_i and n_i represent the lower and upper bounds of the i-th attribute; p_i is the value of the i-th attribute to be examined; d_i is the mean of all the values in the i-th column.
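A minimal sketch of the APC detector distance follows, reading Equation (5) as the sum over the conserved attributes of |p_i - d_i| divided by the attribute range (n_i - m_i). The tuple layout (attribute, lower bound, upper bound, mean) follows the representation printed above; the class name, function name and the antigen values are hypothetical.

```python
from typing import NamedTuple


class ApcAttribute(NamedTuple):
    index: int    # input number from Table 1
    lower: float  # minimum seen in the training data (m_i)
    upper: float  # maximum seen in the training data (n_i)
    mean: float   # mean of the training column (d_i)


def apc_distance(antigen, detector):
    """Range-normalised absolute distance between an antigen and the APC detector."""
    return sum(
        abs(antigen[attr.index] - attr.mean) / (attr.upper - attr.lower)
        for attr in detector
    )


# Detector entries as printed in the text; the antigen is a made-up example.
detector = [ApcAttribute(3, 0, 18, 1.0104), ApcAttribute(4, 0, 33, 0.3238)]
suspicious_antigen = {3: 10.0, 4: 17.0}
distance = apc_distance(suspicious_antigen, detector)
print(distance)
```

An antigen sitting exactly on the training means has distance 0, and the result is then compared with the APC detector threshold.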

    After selecting the conserved pattern, data will be normalized within a range of 0 to 1 based on maximum values derived in preliminary experiments. The effects of varying the number of T detectors, the data dimensionality, and the thresholds of the self data, T detector and APC detector are examined. Unless stated, all other parameters are set to the default values shown in Table 7. CSPRA only requires self data for training. Only 70% of the normal data will be used for training, and the remaining normal data and the fraud data will be used for testing.

    Table 7. Experiment codes and settings for CSPRA

    Parameters              Parameter Values       Default Value
    Data dimensionality     2; 3; 4; 5             4
    Number of Detectors     250; 500; 750; 1000    500
    Self threshold          0.04 to 0.5            0.4
    APC threshold           0.1 to 1.5             0.8
    T detector threshold    0.2 to 2               0.6
    Suspicious threshold    0.1 to 1               0.4

    Experimental results showed that, when the space dimensions increase, the number of samples which were not detected also increased. Yet there was also a decrease in the false negative numbers. By balancing the dimensions and the detection rates, inputs 1, 2, 5, and 6 in Table 1 gave the better results. The size of the T detector set has only a small impact on the detection rate and the false alarm rate when the number is greater than 500. Having more detectors did not guarantee a better detection rate. In this case, the T detector size and the training data set size should be similar.

    Results also showed that the self data, T detector and APC detector thresholds have a significant effect on the performance of the system. For the T detector threshold, the detection rate and the false alarm rate increased when the threshold increased. When the APC detector threshold is varied, the detection rate and the false alarm rate decrease as the threshold increases. Similarly for the threshold of self data, the false alarm rate increased when the threshold decreased. The experimental results show that the parameters in Table 8 produced the best performance.

    Table 8. Best parameters setting in the experiments

    Parameters                          Value
    Self detector threshold             0.3
    T detector threshold                0.8
    APC detector threshold
    T detector size                     700
    Threshold for suspicious antigen    0.5
    Dimensionality of data space        4

    4.3 Results for Hybrid Approach

    The hybrid algorithm is used for combining multiple outputs of different models in order to achieve more reliable decisions and also increase the predictive performance over a single model. By combining the results of CSPRA and DCA, the system will have the ability to classify the user into different categories as shown in Table 9. There will be 6 different combinations of output provided by the classification. The rules based on the decision will determine whether the transaction of the user should be accepted, rejected or suspended for further analysis.

    A suspended case will be accepted by the system if the behavioural engine classifies it as a normal user. Otherwise the suspended case will be rejected and the system classifies it as a fraudulent user.

    In the CART model for the behavioural engine, selecting the correct splits and determining when to stop will affect the predictive accuracy. The results show that the Gini index as the impurity measure and Minimum n to stop the splitting gave the best performance compared with entropy and misclassification.

    Table 9. Decision rules for DCA & CSPRA

    DCA     CSPRA         Decision
    Low     Self          Accept
    High    Self          Suspend
    Low     Nonself       Suspend
    High    Nonself       Reject
    Low     Undetected    Suspend
    High    Undetected    Reject
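The decision rules of Table 9, together with the handling of suspended cases by the behavioural engine, can be sketched as a simple lookup. The function name `decide` and the boolean `behaviour_is_normal` (standing in for the CART model's verdict) are hypothetical names introduced for this illustration.

```python
# (DCA danger level, CSPRA label) -> decision, transcribed from Table 9.
DECISION_RULES = {
    ("low", "self"): "accept",
    ("high", "self"): "suspend",
    ("low", "nonself"): "suspend",
    ("high", "nonself"): "reject",
    ("low", "undetected"): "suspend",
    ("high", "undetected"): "reject",
}


def decide(dca_level, cspra_label, behaviour_is_normal):
    """Combine both AIS outputs; suspended cases defer to the behavioural engine."""
    decision = DECISION_RULES[(dca_level, cspra_label)]
    if decision == "suspend":
        # The behavioural engine (CART in the paper) settles suspended cases.
        return "accept" if behaviour_is_normal else "reject"
    return decision


print(decide("low", "self", behaviour_is_normal=False))
print(decide("high", "self", behaviour_is_normal=True))
print(decide("low", "undetected", behaviour_is_normal=False))
```

Only the three "suspend" combinations ever consult the behavioural engine, so its cost is paid only when the two immune-inspired algorithms disagree or CSPRA cannot decide.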

    In the test, the best parameters for DCA and CSPRA are reused for the hybrid approach. The testing experiment is based on randomly selecting 2000 new samples that also follow a distribution of 50% normal and 50% fraud data. The first 1000 samples form testing data set 1, and the remaining 1000 form testing data set 2.

    Table 10. Testing results for different approaches

    Methods                   TP%    TN%    FP%    FN%
    DCA                       76     83     17     24
    CSPRA (8% undetected)     68     81     15     28
    CART                      78     86     14     22
    Hybrid Algorithm          85     92     8      15

    Table 10 highlights the testing results for the different classification methods. The average score for the 2 testing data sets over 20 runs is shown. The ordinary performance of the CSPRA is possibly due to the fact that the definitions of self and nonself are not stable, and to the inability to classify the samples when they are undetected by detectors. DCA can take care of nonself in samples with low levels of danger from fraudulent users and self in samples with a high level of danger. Yet these can lead to a large number of false positives and false negatives. These results show the usage potential of DCA and CSPRA for fraud detection in an online environment.

    Our hybrid algorithm demonstrates improvements over the DCA and CSPRA with the help of the CART model. It shows a significantly better positive predictive value for fraud detection than what is achieved by the 2 individual AIS approaches when applied to the same data.
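The positive predictive value can be derived from the figures in Table 10. The sketch below assumes that, because the test data is split 50% normal and 50% fraud, the TP% and FP% columns can be compared directly as if they were counts over equally sized classes; the dictionary layout and function name are hypothetical.

```python
# (TP%, TN%, FP%, FN%) per method, transcribed from Table 10.
RESULTS = {
    "DCA": (76, 83, 17, 24),
    "CSPRA": (68, 81, 15, 28),
    "CART": (78, 86, 14, 22),
    "Hybrid": (85, 92, 8, 15),
}


def positive_predictive_value(tp, fp):
    """Share of fraud alerts that were real fraud: TP / (TP + FP)."""
    return tp / (tp + fp)


ppv = {
    method: positive_predictive_value(tp, fp)
    for method, (tp, _tn, fp, _fn) in RESULTS.items()
}
best_method = max(ppv, key=ppv.get)
for method, value in ppv.items():
    print(f"{method}: {value:.3f}")
```

Under this assumption the hybrid algorithm has the highest positive predictive value of the four methods.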

    5. Conclusion

    In this paper, a hybrid model for online fraud detection for an Online Video-on-Demand System is proposed. The model combines two artificial immune system algorithms with behaviour based fraud detection using the CART model. Based on the experimental results, the proposed method demonstrated a higher detection rate and a lower false alarm rate, and handled a high dimensional data set better, when compared with the results achieved by using DCA and CSPRA individually.

    The present study contributes to research in the fraud detection and prevention area by suggesting that hybrid algorithms can exhibit better performance for online fraud detection.

    There are several opportunities for future work stemming from the present research, since the proposed framework can be adapted and reapplied to other fraud detection scenarios such as terrorist detection, financial crime detection, and intrusion and spam detection.

    References

    [1] Forrest, S., Perelson, A., Allen, L. and Cherukuri, R. (1994). "Self-Nonself Discrimination in a Computer". Proc. of IEEE Symposium on Research in Security and Privacy, Oakland, USA, pp. 202-212.

    [2] Jerne, N. (1974). "Towards a network theory of the immune system", Annals of Immunology, vol. 125, pp. 373-389.

    [3] de Castro, L. N. and F. J. Von Zuben (2000). "The clonal selection algorithm with engineering applications". In Proceedings of GECCO'00, Workshop on Artificial Immune Systems and Their Applications, pp. 36-37.

    [4] J. Greensmith, J. Twycross, and U. Aickelin (2006). "Dendritic cells for anomaly detection", In IEEE Congress on Evolutionary Computation (CEC 2006), pp. 664-671.

    [5] Perelson, A. S. & Weisbuch, G. (1997). "Immunology for Physicists", Rev. of Modern Physics, 69(4), pp. 1219-1267.

    [6] P. Matzinger (2002). "The Danger Model: A Renewed Sense of Self", Science, 296, pp. 301-305.

    [7] Qian Liu, Tong Li, and Wei Xu (2009). A subjective and objective integrated method for fraud detection in financial system, In Proceedings of Machine Learning and Cybernetics, pp. 1339-1345.

    [8] Rozsnyai, S., Schiefer, J., and Schatten, A. (2007). Solution architecture for detecting and preventing fraud in real time, In Proceedings of Digital Information Management, ICDIM'07, pp. 152-158.

    [9] Wei Chai, Hoogs, B.K., Verschueren, B.T. (2006). Fuzzy Ranking of Financial Statements for Fraud Detection, In Proceedings of International Conference on Fuzzy Systems, pp. 152-158.

    [10] Tao Guo, Gui-Yang Li (2008). Neural data mining for credit card fraud detection, In Proceedings of International Conference on Machine Learning and Cybernetics, pp. 3630-3634.

    [11] Jianyun Xu, Sung, A.H., Qingzhong Liu (2006). Tree Based Behaviour Monitoring for Adaptive Fraud Detection, In Proceedings of International Conference on Pattern Recognition, pp. 1208-1211.

    [12] Z. Ferdousi and A. Maeda (2006). Unsupervised Outlier Detection in Time Series Data, In Proceedings of 22nd International Conference on Data Engineering Workshops.

    [13] E. Kirkos, C. Spathis, and Y. Manolopoulos (2007). Data mining techniques for the detection of fraudulent financial statements, Expert Systems with Applications, pp. 23 32.

    [14] C. Spathis (2002). Detecting false financial statements using published data: some evidence from Greece, Managerial Auditing Journal, vol. 17, no, pp. 179-191.

    [15] J. Twycross and U. Aickelin (2007). "An Immune-inspired Approach to Anomaly Detection". Chapter in Handbook of Research on Information Assurance and Security, Idea Publishing Group.

    [16] J. P. Twycross (2007). "Integrated Innate and Adaptive Artificial Immune Systems applied to Process Anomaly Detection". PhD Thesis, University of Nottingham, UK.

    [17] A. Secker, A. Freitas, and J. Timmis (2003). "AISEC: An Artificial Immune System for Email Classification". In Proceedings of Congress on Evolutionary Computation, Canberra, IEEE, pp. 131-139.

    [18] M. Ayara, J. Timmis, R. de Lemos, and S. Forrest (2005). "Immunising Automated Teller Machines". In Proceedings of the 4th International Conference on Artificial Immune Systems, Banff, Canada, pp. 404-417.

    [19] D. L. Chao and S. Forrest (2002). "Information Immune Systems". In Proceedings of the 1st International Conference on Artificial Immune Systems, pp. 132-140, Canterbury, England.

    [20] Emilie Lundin, Hakan Kvarnström and Erland Jonsson (2002). A synthetic fraud data generation methodology. In Proceedings of the Fourth International Conference on Information and Communications Security (ICICS 2002), Singapore, Volume 2513 of Lecture Notes in Computer Science, Springer-Verlag.

    [21] Senhua Yu and Dipankar Dasgupta (2008). "Conserved Self Pattern Recognition Algorithm", ICARIS, LNCS 5132, pp. 279-290.

    [22] Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone (1984). "Classification and regression trees". Monterey, Calif., USA: Wadsworth.

    [23] D. Moore (2006). "Basic Practice of Statistics". W. H. Freeman, San Francisco, Calif., USA.