
SoK: Applying Machine Learning in Security - A Survey

Heju Jiang*, Jasvir Nagra, Parvez Ahammad*

Instart Logic, Inc. {hjiang, jnagra, pahammad}@instartlogic.com

ABSTRACT

The idea of applying machine learning (ML) to solve problems in security domains is almost 3 decades old. As information and communications grow more ubiquitous and more data become available, many security risks arise, as does the appetite to manage and mitigate such risks. Consequently, research on applying and designing ML algorithms and systems for security has grown fast, ranging from intrusion detection systems (IDS) and malware classification to security policy management (SPM) and information leak checking. In this paper, we systematically study the methods, algorithms, and system designs in academic publications from 2008-2015 that applied ML in security domains. 98% of the surveyed papers appeared in the 6 highest-ranked academic security conferences and 1 conference known for pioneering ML applications in security. We examine the generalized system designs, underlying assumptions, measurements, and use cases in active research. Our examinations lead to 1) a taxonomy on ML paradigms and security domains for future exploration and exploitation, and 2) an agenda detailing open and upcoming challenges. Based on our survey, we also suggest a point of view that treats security as a game theory problem instead of a batch-trained ML problem.

Keywords: Security, Machine Learning, Large-scale Applications, Game Theory, Security Policy Management

1. INTRODUCTION AND MOTIVATION

Since Dorothy Denning's seminal 1987 paper on intrusion detection [1], ML and data mining (DM) have steadily gained attention in security applications. DARPA's 1998 network intrusion detection evaluation [2] and the KDD (Conference on Knowledge Discovery and Data Mining) Cup's 1999 challenge [3, 4] raised the profile of ML in security contexts. Yet, constrained by hardware and system resources [4], large-scale ML applications did not receive much attention for many years.

*Corresponding authors

In 2008, the ACM Conference on Computer and Communications Security (CCS) hosted the 1st Artificial Intelligence in Security (AISec) Workshop, which has since been a dedicated venue at a top-level security conference for the intersection of ML and security. From 2008, the pace of research and publicity of ML in security accelerated in academic communities (section 2.3), and industry venues (e.g. Black Hat, RSA) also shifted their interests. For instance, ML in security was still a topic of minority interest at Black Hat USA 2014 in August [5], but at RSA 2016 in February, the majority of vendors claimed to deploy ML in their products [6]. A part of this shift may be motivated by the sudden increase in black swan events like the discovery of the CRIME, BEAST, and Heartbleed vulnerabilities. The discovery of these vulnerabilities suggests that organizations may be attacked via previously unknown classes of attacks. Defending against these types of attacks requires monitoring not just for known attack vectors, but also for behavior suggestive of a compromised machine. The latter requires the gathering and analysis of much larger sets of data.

Advances in hardware and data processing capacities enabled large-scale systems. With increasing amounts of data from growing numbers of information channels and devices, the analytic tools and intelligent behaviors provided by ML become increasingly important in security. With DARPA's Cyber Grand Challenge final contest looming [7], research interest in ML and security is becoming even more conspicuous. Now is the crucial time to examine research done in ML applications and security. To do so, we studied the state of the art of ML research in security between 2008 and early 2016, and systematize this research area in 3 ways:

1. We survey cutting-edge research on applied ML in security, and provide a high-level overview taxonomy of ML paradigms and security domains.

2. We point to research challenges that will improve, enhance, and expand our understanding, designs, and efficacy of applying ML in security.

3. We emphasize a position which treats security as a game theory problem.

2. OVERVIEW: STRUCTURE & SCOPE

While we realize there are different ways to classify existing security problems based on purpose, mechanism, targeted assets, and point of flow of the attack, our SoK's section structure is based on the "Security and Privacy" category of the 2012 ACM Computing Classification System [8], which is a


arXiv:1611.03186v1 [cs.CR] 10 Nov 2016


Figure 1: Generalization of surveyed ML System Designs in Security; details in section 2.1

combination of specific use cases (e.g. malware, phishing), technique (e.g. information flow), and targeted assets (e.g. web applications, proxies). We present the state-of-the-art ML applications in security as follows: Section 3 and Tables 2 & 3 discuss Network Security¹, Section 4 and Table 4 survey Security Services, Section 5 and Table 5 specify advances in Software & Applications Security, Section 6 and Tables 6 & 7 lay out a taxonomy for System Security, and Section 7 and Tables 8, 9 & 10 summarize progress since 2008 in Malware Detection, IDS, and Social Engineering. Throughout the survey, we share our frameworks for ML system designs, assumptions, and algorithm deployments in security.

We focus our survey on security applications and security-related ML and AI problems on the defense side; hence our scope excludes theories related to security such as differential privacy and privacy preservation in ML algorithms [9, 10, 11], and excludes ML applications in side-channel attacks such as [12, 13, 14]. Partly because there is already a 2013 SoK on the evolution of Sybil defense [15] in online social networks (OSN), and partly because we would like to leave it as a small exercise for our readers, we excluded Sybil defense schemes in OSN as well [16, 17, 18, 19, 20]. Still, with a broad base, we propose an alternative position to frame security issues, and we also recommend a taxonomy for ML applications in security use cases. Yet, we do not conclude with a definitive list of "right" or "correct" approaches or methods. We believe that the range of applications is too wide to fit into one singular use case or analysis framework. Instead, we intend this paper as a systematic design and method overview of thinking about researching and developing ML algorithms and applications, one that will guide researchers in their problem domains on an individual basis. We target our work to security researchers and practitioners, so we assume that our readers have general knowledge of key security domains and awareness of common ML algorithms; we also define terms when needed.

We agree with the assessment of top conferences in [21].² We

¹All papers are listed in chronological order with the first author's last name followed by venue acronym and year.
²We use the same conference-ranking websites to first decide our list of top-level conferences: (1) Microsoft Academic Search - Top Conferences in Security & Privacy http://academic.research.microsoft.com/RankList?entitytype=3&topDomainID=2&subDomainID=2, (2) Guofei Gu - Computer Security Conference Ranking and Statistic http://faculty.cs.tamu.edu/guofei/sec_

systematically went through all proceedings between 2008 and early 2016 of the top 6 network- and computer-security conferences to collect relevant papers. Because of KDD's early and consistent publication record on ML applications in security, and its status as a top-level venue for ML and DM applications, we also include KDD's 2008-2015 proceedings. To demonstrate the wide-ranging research attention drawn to ML applications in security, we also added chosen selections from the AISec workshop, the International Conference on Machine Learning (ICML), Neural Information Processing Systems (NIPS), and Internet Measurement Conference (IMC) papers between 2008-2015, mostly in the "Future Development" section.

2.1 ML System Designs in Security

Figure 1 shows the generalization of ML system designs when applied in security that emerged from our survey of the papers (the legend is on the figure's bottom left). In different use cases, the system components may bear different names, but their functionalities and positions are captured in the figure. For example:

1. Knowledge base is the baseline of known normality and/or abnormality. Depending on use cases, these include but are not limited to: blacklists (BL), whitelists (WL), watchlists; known malware signatures, system traces, and their families; initial sets of malicious web pages; existing security policies or rules, etc.

2. Data sources are where relevant data is collected. They can be either off-line or live online data feeds, e.g. malware traces collected after execution (off-line), or a URL stream (online).

3. Training data are labeled data which are fed to classifiers in training. They can be standard research datasets, new data (mostly from industry) labeled by humans, synthetic datasets, or a mix.

4. Pre-processor and feature extractor construct features from data sources, for example: URL aggregators, graph representations, SMTP header extractions, and n-gram model builders.

conf_stat.htm, and (3) Jianying Zhou - Top Crypto and Security Conferences Ranking http://icsd.i2r.a-star.edu.sg/staff/jianying/conference-ranking.html. All 3 rankings have the same top 6, and because Crypto and Eurocrypt do not have papers within our focus, we decided on these 4: ACM CCS, IEEE S&P (hereafter "SP"), NDSS, and USENIX Security (hereafter "Sec" or "USENIX")



                  Supervised | Semi-supervised | Unsupervised | HITL | Game Theory
Attacker Type
  Passive         3.1/4/5/6.1/6.2/7.1/7.2/7.3 | 5/6.1/7.2 | 3.1/3.2/5/6.1/6.2/7.1/7.2 | 5/6.1 | -
  Semi-aggressive 3.1/3.2/4/5/6.1/6.2/7.1/7.2/7.3 | 7.1/7.3 | 3.1/5 | - | 6.2
  Active          - | - | - | 7.1 | 6.2/7.2
Means of Attack
  Server          6.1/7.2 | 7.2 | 6.2 | - | -
  Network         3.1/3.2/5/7.2/7.3 | 5/7.3 | 3.1/3.2/6.1/7.1 | - | 6.2
  Client app      5 | - | 5 | 5/6.1 | -
  User            3.1/4/7.2/7.3 | 7.2 | 4/7.2 | - | 7.2
  Client machine  6.1/6.2/7.1 | 6.1/7.1 | 6.2/7.1 | - | -
Purpose of Attack [22, 23]
  Confidentiality 4/5/6.1/7.2 | - | 4/6.2/7.2 | - | -
  Availability    3.1/6.2/7.3 | 7.2 | 3.1 | - | 6.2/7.2
  Integrity       3.2/5/6.1/6.2/7.1/7.2/7.3 | 5/6.1/7.1/7.2/7.3 | 3.2/5/6.1/7.1/7.2 | 5/6.1 | -

Table 1: Overview matrix of sections covering ML paradigms and security domains; details in section 2.2

Dynamic analyzers and static analyzers are used most often in malware-related ML tasks, and a human feedback loop is added when the system's design intends to be semi-supervised or human-in-the-loop (HITL).
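The generalized pipeline of Figure 1 can be sketched in code. The sketch below is purely illustrative - the class names, the toy decision rule, and the example records are our own inventions rather than components of any surveyed system - but it shows how a knowledge base of known normality/abnormality short-circuits the trained classifier:

```python
from dataclasses import dataclass, field

# Illustrative sketch of the generalized pipeline in Figure 1.
# All names and the decision rule are hypothetical, not from the paper.

@dataclass
class KnowledgeBase:
    blacklist: set = field(default_factory=set)
    whitelist: set = field(default_factory=set)

def extract_features(record: str) -> dict:
    # Stand-in pre-processor / feature extractor (e.g. an n-gram builder).
    return {"length": len(record), "digits": sum(c.isdigit() for c in record)}

class Classifier:
    def __init__(self, kb: KnowledgeBase):
        self.kb = kb

    def decide(self, record: str) -> str:
        # Knowledge-base lookups come first...
        if record in self.kb.whitelist:
            return "benign"
        if record in self.kb.blacklist:
            return "malicious"
        # ...and only unknown records fall through to the (toy) model.
        feats = extract_features(record)
        return "malicious" if feats["digits"] > feats["length"] // 2 else "benign"

kb = KnowledgeBase(blacklist={"evil.example"}, whitelist={"good.example"})
clf = Classifier(kb)
print(clf.decide("evil.example"))  # knowledge-base hit
print(clf.decide("a1b2c3d4x"))     # falls through to the model
```

In real systems the toy rule is replaced by a trained model, and the knowledge base is continuously refreshed by the feedback loop described above.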

2.2 ML Paradigms in Security Problems

Table 1 shows a matrix with rows indicating different ways of classifying the security problems, and the columns showing well-understood ML paradigms. Based on the threat models and modeling purposes presented in the papers, we qualitatively group the attackers into three groups. If there are multiple attacker types in one section, the section's numbering appears multiple times accordingly.

1. Passive attackers make no attempt to evade detection; their behaviors fit into the descriptions of the threat models.

2. Semi-aggressive attackers have knowledge of the detectors, and only attempt to evade detection.

3. Active attackers not only have knowledge of the detectors and attempt to evade detection, but also actively try to poison, mislead, or thwart detection.

4. Knowledge of attackers is information on at least one of five aspects: the learning algorithms themselves, the algorithms' feature spaces, the algorithms' parameters, training and evaluation data - regardless of being labeled or not - used by the algorithms, and decision feedback given by the algorithms [23, 24, 25].

Influenced by [22, 23], we extend their definitions, and qualitatively categorize attackers' primary purpose as compromising the confidentiality, availability, or integrity of legitimate systems, services, and users.

1. Attacks on confidentiality compromise the confidential or secret information of systems, services, or users (e.g. password crackers).

2. Attacks on availability make systems and services unusable with unwanted information, requests, or many errors in defense schemes (e.g. DDoS, spam).

3. Attacks on integrity masquerade malicious intentions as benign in systems, services, and users (e.g. malware).

We also define the ML paradigms shown in the matrix:

1. Supervised learning uses labeled data for training.

2. Semi-supervised learning uses both labeled and unlabeled data for training.

3. Unsupervised learning has no labeled data available for training.

4. Human-in-the-loop (HITL) learning incorporates active human feedback to the algorithm's decisions into the knowledge base and/or algorithms.

5. Game Theory (GT)-based learning considers learning as a series of strategic interactions between the model learner and actors with conflicting goals. The actors can be data generators, feature generators, chaotic human actors, or a combination [23, 24, 26, 27, 28].

For "Means of Attack" in Table 1, server, network, and user are straightforward and intuitive, so here we only describe "client app" and "client machine". Client app is any browser-based means of attack on any client device, and client machine is any non-browser-based means of attack on any client device.

As shown in Table 1, the majority of surveyed papers in different security domains use supervised learning to deal with passive or semi-aggressive attackers. However, the core requirement of supervised learning - labeled data - is not always viable or easy to obtain, and authors have repeatedly written about the difficulty of obtaining labeled data for training. Based on this observation, we conclude that exploring semi-supervised and unsupervised learning approaches would expand the research foundation of ML applications in security domains, because semi-supervised and unsupervised learning can utilize unlabeled datasets that supervised learning approaches have left untouched.
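To make the argument concrete, the toy sketch below (invented data and a nearest-centroid learner, not any surveyed system) shows a minimal self-training loop: a model trained on a handful of labels pseudo-labels its most confident unlabeled points and retrains, exploiting data a purely supervised pipeline would discard:

```python
import numpy as np

# Minimal self-training sketch on two invented "benign"/"malicious" clusters.
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),    # benign-like cluster
               rng.normal(3.0, 0.5, (50, 2))])   # malicious-like cluster
y = np.full(100, -1)                             # -1 marks unlabeled samples
y[:3], y[50:53] = 0, 1                           # only 3 labels per class

for _ in range(10):                              # self-training rounds
    centroids = np.array([X[y == c].mean(axis=0) for c in (0, 1)])
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    pred = dists.argmin(axis=1)
    margin = np.abs(dists[:, 0] - dists[:, 1])   # confidence proxy
    unlabeled = np.where(y == -1)[0]
    if len(unlabeled) == 0:
        break
    best = unlabeled[margin[unlabeled].argmax()] # most confident point
    y[best] = pred[best]                         # adopt its pseudo-label

centroids = np.array([X[y == c].mean(axis=0) for c in (0, 1)])

def classify(p):
    return int(np.linalg.norm(centroids - p, axis=1).argmin())

print(classify([0.1, -0.2]), classify([2.9, 3.1]))
```

Real semi-supervised systems use richer base learners and stopping criteria, but the structure - label, retrain, repeat - is the same.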

Moreover, during our survey, we realized that many ML applications in security assume that training and testing data come from the same distribution (in statistical terms, this is the assumption of stationarity). However, in the real world, it is highly unlikely that data are stationary; indeed, the data could very well be generated by an adversarial data generator producing training and/or testing data sets, as in the case of [26], or simply be generated in response to specific



                  Supervised | Semi-supervised | Unsupervised | HITL | Game Theory
Attacker Type
  Passive         58 (49%) | 7 (5.9%) | 24 (20%) | 2 (1.7%) | 0 (0%)
  Semi-aggressive 18 (15%) | 4 (3.4%) | 3 (2.5%) | 0 (0%) | 1 (0.85%)
  Active          0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 2 (1.7%)
Means of Attack
  Server          4 (3.4%) | 1 (0.85%) | 1 (0.85%) | 0 (0%) | 0 (0%)
  Network         17 (14.4%) | 4 (3.4%) | 11 (9.3%) | 0 (0%) | 1 (0.85%)
  Client app      4 (3.4%) | 0 (0%) | 1 (0.85%) | 2 (1.7%) | 0 (0%)
  User            31 (26%) | 2 (1.7%) | 9 (7.6%) | 0 (0%) | 2 (1.7%)
  Client machine  20 (17%) | 4 (3.4%) | 5 (4.2%) | 0 (0%) | 0 (0%)
Purpose of Attack [22, 23]
  Confidentiality 16 (13.6%) | 0 (0%) | 7 (6%) | 0 (0%) | 0 (0%)
  Availability    9 (7.6%) | 1 (0.85%) | 5 (4.2%) | 0 (0%) | 3 (2.5%)
  Integrity       51 (43.2%) | 10 (8.5%) | 15 (12.7%) | 2 (1.7%) | 0 (0%)

Table 2: Overview matrix of sections covering ML paradigms and security domains. The number in each box is the number of papers in our survey that belong to the category, followed in parentheses by its percentage among all counted papers.

models as in [29]. Our observation from this comprehensive survey confirmed [23]'s statement, and we propose that GT-based learning approaches and HITL learning system designs should be explored more, in order to design more efficient security defense mechanisms to deal with active and unpredictable adversaries. At the same time, human knowledge and judgment in HITL should go beyond feature engineering, to providing feedback on decisions made by ML models. Some theory-leaning papers have modeled spam filtering as Bayesian games or Stackelberg games [26, 29]. Use cases in data sampling and model training with poisoned or low-confidence data have also been briefly explored in the literature [27, 30, 31].
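As a minimal illustration of the GT-based view - with entirely invented payoffs, scores, and rules, merely in the spirit of the Stackelberg spam-filtering games cited above - a defender who anticipates the attacker's best response chooses a different threshold than one who assumes stationary data:

```python
# Toy Stackelberg game between a spam filter (leader) and a spammer
# (follower). All numbers below are invented for illustration only.

def attacker_evades(threshold, max_shift=0.5):
    # The attacker's true "spamminess" is 1.0; it can pay to disguise
    # itself down to 1.0 - max_shift. It evades iff it can get under
    # the defender's threshold.
    return threshold > 1.0 - max_shift

def defender_utility(threshold, fp_cost=1.0, fn_cost=2.0):
    false_positive = threshold <= 0.3          # legitimate mail scores ~0.3
    evaded = attacker_evades(threshold)        # attacker's best response
    return -(fp_cost if false_positive else 0.0) - (fn_cost if evaded else 0.0)

# Leader move: pick the threshold that maximizes utility GIVEN the
# follower's best response, rather than fitting past (stationary) data.
candidates = [i / 100 for i in range(101)]
best = max(candidates, key=defender_utility)
print(best)
```

Any threshold strictly between the legitimate score (0.3) and the attacker's best disguised score (0.5) is optimal here; the point is that the equilibrium threshold depends on the adversary's options, not on a fixed data distribution.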

2.3 Timeline of ML in Security

Based on seminal works and establishments in notable venues, the gradually increasing levels of interest in ML research applied to security are fairly visible. Here we gather some milestone events:

1. 1987: Denning published "An Intrusion-Detection Model" [1], first framing security as a learning problem

2. 1998: DARPA IDS design challenge [2]

3. 1999: KDD Cup IDS design challenge [3, 4]

4. 2008: CCS hosted the 1st AISec workshop, which continues to operate each year [9]

5. 2007, 2008: Twice, KDD hosted the International Workshop on Privacy, Security, and Trust (PinKDD) [32]

6. 2010, 2012: Twice, KDD hosted the Intelligence and Security Informatics Workshop (ISI) [33, 34]

7. 2011: "Adversarial Machine Learning" published at the 4th AISec [23]

8. 2012: "Privacy and Cybersecurity: The Next 100 Years" by Landwehr et al. published [35]

9. 2013: Manifesto from the Dagstuhl Perspectives Workshop published as "Machine Learning Methods for Computer Security" by Joseph et al. [36]

10. 2014: KDD hosted its 1st "Security & Privacy" session in the main conference program [37]

11. 2014: ICML hosted its 1st, and so far only, workshop on Learning, Security, and Privacy (LSP) [38]

12. 2016: AAAI hosted its 1st Artificial Intelligence for Cyber Security workshop (AICS) [39]

2.4 Related Work

Despite the surge of research interest and industry applications at the intersection of ML and security, few surveys or overviews were published after 2008, the watershed year of increasing interest in this particular domain. In 2013, [40] surveyed server-side web application security, [41] surveyed data mining applied to security in the cloud with a focus on intrusion detection, and [42] discussed an ML perspective on network anomaly detection. While they are helpful and informative, the former two are limited by their scope and perspective, and the latter serves as a textbook, hence lacking the quintessence of a survey - mapping progress and charting the state of the art. Collections of papers in 2002 and 2012 [43] discussed applications of DM in computer security, but lack a systematic survey of ML applications in resolving security issues. [44] briefly compared two network anomaly detection techniques, but is limited in scope. [45], from 2009, conducted a comprehensive survey of anomaly detection techniques, some involving discussions of security domains. The Dagstuhl Manifesto of 2013 [36] articulated the status quo and looked to the future of ML in security, but the majority of the literature listed was published before 2008. [46], from 2010, highlighted use cases and challenges for ML in network intrusion detection, but did not incorporate a high-level review of ML in security in recent years.

3. NETWORK SECURITY

3.1 Botnets and Honeypots

Research on botnets in our surveyed literature focuses mainly on designing systems to detect command-and-control (C&C) botnets, where many bot-infected machines are controlled and coordinated by a few entities to carry out malicious activities [47, 48, 49]. Those systems need to learn decision boundaries between human and bot activities; therefore ML-based classifiers are at the core of those systems, and are often trained with labeled data in supervised learning environments. The most popular classifier is support vector machines (SVMs) with different kernels, while spatial-temporal time series analysis and probabilistic inference are also notable techniques employed in ML-based classifiers. Topic clustering, mostly seen in natural language processing (NLP), is used to build a large-scale system to



identify bot queries [50]. In the botnet detection literature, 3 core assumptions are widely shared:

1. Botnet protocols are mostly C&C [47, 48, 51]

2. Individual bots within the same botnet behave similarly and can be correlated with each other [51, 52]

3. Botnet behaviors are different and distinguishable from those of legitimate human users, e.g. human behaviors are more complex [53, 54, 55]

Other stronger assumptions include that bots and humans interact with different server groups [56], and that content features from messages generated by bots and humans are independent [53]. While classification techniques differ, WLs, BLs, hypothesis testing, and a classifier [47, 49, 53] are the usual system components. Attempts have been made to abstract state-machine models of networks to simulate real-world network traffic and create honeypots [57]. Ground truths are often heuristic [52], labeled by human experts, or a combination - even at large scale, human-labeled ground truths are used; for example, in [55], game masters' visual inspections serve as ground truth to detect bots in online games. In retrospect, the evolution of botnet detection is clear: from earlier and more straightforward uses of classification techniques such as clustering and NB, the research focus has expanded from the last step of classification to the important preceding step of constructing suitable metrics that measure and distinguish bot-based and human-based activities [50, 55].
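As an invented illustration of such a metric (not taken from any surveyed paper), timer-driven bots tend to produce near-constant request inter-arrival times while human activity is bursty, so even a simple dispersion statistic can separate the two:

```python
import statistics

# Toy bot-vs-human metric: the coefficient of variation (stdev / mean)
# of request inter-arrival times. Intervals and threshold are invented.

def coeff_of_variation(intervals):
    return statistics.stdev(intervals) / statistics.mean(intervals)

def looks_like_bot(intervals, threshold=0.3):
    # Timer-driven traffic is nearly periodic, so its CV is small.
    return coeff_of_variation(intervals) < threshold

bot_intervals = [1.0, 1.02, 0.98, 1.01, 0.99]   # seconds, timer-like
human_intervals = [0.2, 5.0, 0.4, 12.0, 1.5]    # bursty browsing

print(looks_like_bot(bot_intervals), looks_like_bot(human_intervals))
```

Deployed systems combine many such behavioral features; the sketch only shows why metric construction, rather than the final classifier, carries much of the detection power.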

3.2 Proxies and DNS

Classifying DNS domains that distribute or host malware, scams, and malicious content has drawn research interest, especially in passive DNS analysis. There are two main approaches: reputation systems [58, 59, 60] and classifiers [61, 62]. A reputation system scores benign and malicious domains and DNS hosts, while an ML-based classifier learns boundaries between the two. Nonetheless, both reputation systems and classifiers use various decision trees, random forests (RF), naïve Bayes (NB), SVM, and clustering techniques for mostly supervised learning-based scoring and classification. Many features used are from protocols and network infrastructures, e.g. the border gateway protocol (BGP) and its updates, autonomous systems (AS), registration, zones, hosts, and public BLs. Similar to botnet detectors, variations of BLs, WLs, and honeypots [63] serve similar functions as knowledge bases, while ground truths are often taken from public BLs, limited WLs, and anti-virus (AV) vendors such as McAfee and Norton [58, 59, 61]. But before any ML attempts take place, most studies assume the following:

1. Malicious uses of DNS are distinct and distinguishablefrom legitimate DNS services.

2. The data collection process - regardless of different names such as data flow, traffic recorder, or packet assembler - follows a centralized model. In other words, all the traffic/data/packets flow through certain central nodes to be collected.

Stronger assumptions include that AS hijackers cannot manipulate the AS path before it reaches them [60], and that maliciousness will cause an accurate IP address classifier to fail [64]. Besides analyzing the status quo, [64, 65, 66] showed efforts

to preemptively protect network measurement integrity and predict potentially malicious activities from web domains and IP address spaces.
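A hypothetical sketch of the reputation-system approach is below; the feature types (domain age, IP churn, TTL) are of the kind used in passive DNS analysis, but the weights, thresholds, and labels are invented for illustration:

```python
# Invented reputation-style scoring over passive-DNS features.
# Weights and cutoffs are arbitrary; real systems learn them from data.

def reputation_score(domain_age_days, distinct_ips, avg_ttl_seconds):
    score = 0.0
    score += 1.0 if domain_age_days > 365 else -1.0  # older domains trusted more
    score += -1.0 if distinct_ips > 20 else 0.5      # heavy IP churn is fast-flux-like
    score += -0.5 if avg_ttl_seconds < 300 else 0.5  # very low TTL aids agility
    return score

def label(score, cutoff=0.0):
    return "benign" if score > cutoff else "suspicious"

print(label(reputation_score(1200, 2, 3600)))  # stable, old domain
print(label(reputation_score(10, 50, 60)))     # young, fast-flux-like
```

The surveyed systems replace such hand-set weights with supervised scoring trained against public BLs and AV-vendor ground truth.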

4. SECURITY SERVICES

Both offense and defense for access control, authentication, and authorization reside within the domain of security services. Defeating audio and visual CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart) [67, 68, 69, 70], cracking passwords [71, 72, 73], measuring password strength [72, 74, 75], and uncovering anonymity [76, 77, 78, 79] are the 4 major use cases. On the offense side, specialized ML domains such as computer vision, signal processing, and NLP automate attacks on user authentication services, i.e. textual or visual passwords and CAPTCHAs, and uncover hidden identities and services. On the defense side, entropy-based and ML-based systems calculate password strength. Beyond traditional user authentication schemes, behavioral metrics of users are also introduced. Following the generalized ML pipeline shown in Figure 1, the "classifier" is replaced by a "recognition engine" in the password cracking process, and a "user differentiation engine" in authentication metric engineering [80, 81]. Hence the process becomes: "Data source → Pre-process & feature extraction → Recognition or user differentiation engine → Decision" for ML-based security services. A noteworthy trend to observe is that attacks on CAPTCHAs are getting more generalized - from utilizing SVM in 2008 to attack a specific type of text CAPTCHA [67], to a generic approach to attacking text-based CAPTCHAs proposed in 2015 [70].

ML-based attacks on textual and visual CAPTCHAs typically follow a 4-step process:

1. Segmentation: e.g. signal-to-noise ratio (SNR) for audio; hue, saturation, value (HSV) for visual [67, 69, 70, 82]

2. Signal or image representation: e.g. discrete Fourier transformation (audio) [82], letter binarization (visual) [70]

3. Feature extraction: e.g. spectro-temporal features, character strokes [67, 82]

4. Recognition: K-nearest neighbors (KNN), SVM (RBF kernel), convolutional neural networks (CNN) [68, 71]
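The 4 steps can be sketched end-to-end on a tiny synthetic "image"; the 3x3 glyph templates, the projection-based segmenter, and the nearest-neighbour matcher below are all invented stand-ins for the real computer-vision machinery:

```python
import numpy as np

# Invented 3x3 binary glyph templates standing in for real font glyphs.
TEMPLATES = {
    "I": np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]]),
    "L": np.array([[1, 0, 0], [1, 0, 0], [1, 1, 1]]),
}

def segment(image):
    # Step 1: split on empty (all-zero) columns -- a crude projection profile.
    cols = image.sum(axis=0)
    glyphs, start = [], None
    for i, c in enumerate(np.append(cols, 0)):
        if c and start is None:
            start = i
        elif not c and start is not None:
            glyphs.append(image[:, start:i])
            start = None
    return glyphs

def pad_to(glyph, width=3):
    # Step 2: re-center each glyph on a fixed-size binary canvas.
    extra = width - glyph.shape[1]
    left = extra // 2
    return np.pad(glyph, ((0, 0), (left, extra - left)))

def recognize(glyph):
    # Steps 3-4: raw pixels as features, 1-nearest-neighbour template match.
    dists = {ch: int(np.abs(tpl - glyph).sum()) for ch, tpl in TEMPLATES.items()}
    return min(dists, key=dists.get)

captcha = np.hstack([TEMPLATES["I"], np.zeros((3, 1), dtype=int), TEMPLATES["L"]])
print("".join(recognize(pad_to(g)) for g in segment(captcha)))  # -> IL
```

Real attacks substitute SNR/HSV segmentation, learned features, and KNN/SVM/CNN recognizers for these toy pieces, but the pipeline shape is the same.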

On the side of password-related topics in security services, there are 2 password models: whole-string Markov models, and template-based models [73]. Concepts in statistical language modeling, such as natural language encoders and the n-grams associated with Markov models (presented as directed graphs with nodes labeled by n-grams), and context-free grammars, are common probabilistic foundations for building password strength meters and password crackers [71, 72, 82].
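As a sketch of the whole-string Markov idea - with a tiny invented corpus and arbitrarily chosen smoothing constants - a password's probability under bigram transition statistics can serve as a crude strength signal:

```python
import math

# Toy whole-string Markov password model: a password's probability is the
# product of bigram transition probabilities estimated from a (tiny,
# invented) leaked-password corpus.
corpus = ["password", "passw0rd", "letmein", "password1"]

counts, totals = {}, {}
for word in corpus:
    w = "^" + word                       # '^' marks start-of-string
    for a, b in zip(w, w[1:]):
        counts[(a, b)] = counts.get((a, b), 0) + 1
        totals[a] = totals.get(a, 0) + 1

def log_prob(password, alpha=0.01, alphabet=96):
    # Additive smoothing so unseen transitions keep small, nonzero mass.
    w = "^" + password
    lp = 0.0
    for a, b in zip(w, w[1:]):
        num = counts.get((a, b), 0) + alpha
        den = totals.get(a, 0) + alpha * alphabet
        lp += math.log(num / den)
    return lp

# Higher log-probability = more guessable under this model.
print(log_prob("password") > log_prob("xk7#Qp2v"))
```

A strength meter inverts this: the lower a password's probability under models trained on leaked corpora, the stronger it is rated; crackers enumerate candidates in descending probability order.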

5. SOFTWARE & APPLICATION SECURITY

ML research in software and application security mostly concentrates on web application security in our survey, and has used supervised learning to train popular classifiers such as NB and SVM to detect web-based malware and JavaScript (JS) code [83, 84], filter unwanted resources and requests such as malicious advertisements [85, 86, 87, 88], predict unwanted resources and requests (e.g. future blacklisted websites) [89, 90, 91], and quantify web application vulnerabilities [92]. While [91] explored building a web application anomaly detector with scarce training data, most use



cases follow the supervised paradigm assuming plentiful labeled data: Data source (web applications, static/dynamic analyzers) → feature extraction (often with a specific pre-filter, metrics, and de-obfuscator if needed) → classifiers trained with labeled data. Apart from this supervised setting, if a human expert's feedback is added after the classifiers' decisions [85], it forms a semi-supervised system. Regardless of system design, the usual assumption holds: malicious activities or actors are different from normal, benign ones, and likely do not change much. The knowledge bases of normality and abnormality can vary, from historical regular expression lists [86] to other publicly available detectors [84]. Graph-based algorithms [87] and image recognition [90] are both used in resource filtering, but in detecting JS malware and evasions, and in quantifying leaks, having suitable measurements of similarity is a significant focal point. Indeed, from [83, 84, 92], ML-based classifiers do well in finding similarities between mutated malicious code snippets, while the same code pieces could evade static or dynamic analyzer detection.
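As a minimal illustration of similarity-based detection of mutated code (snippets and threshold invented; real systems use far richer features than this), character n-gram sets with Jaccard similarity already link a snippet to a lightly mutated variant:

```python
# Toy similarity measure for mutated code: character n-gram sets
# compared with Jaccard similarity. Snippets are invented examples.

def ngrams(code, n=3):
    return {code[i:i + n] for i in range(len(code) - n + 1)}

def jaccard(a, b):
    A, B = ngrams(a), ngrams(b)
    return len(A & B) / len(A | B)

known_malicious = "eval(unescape('%61%6c%65%72%74'))"
mutated = "eval(unescape('%61%6c%65%72%74')); var x=1"
benign = "function add(a, b) { return a + b; }"

print(round(jaccard(known_malicious, mutated), 2))
print(round(jaccard(known_malicious, benign), 2))
```

A signature-based static analyzer keyed on the exact known string would miss the mutated variant, while the set-overlap measure still scores it as close kin to the known sample.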

6. SYSTEM SECURITY

6.1 Vulnerability and Policy Management

As Landwehr noted [93], ML can be applied in SPM. However, in automatic fingerprinting of operating systems (OS), C4.5 decision trees, SVM, RF, and KNN - some of the most commonly used ML-based classifiers in security - failed to distinguish remote machine instances with coarse- and fine-grained differences, as the algorithms cannot exploit semantic knowledge of protocols or send multi-packet probes [94]. Yet by taking advantage of semantic and syntactic features, plus semi-supervised system design, [95, 96, 97] showed that SVM (optimized by the sequential minimal optimization [SMO] algorithm), KNN, and NLP techniques do well in Android SPM. On the other hand, in vulnerability management [98, 99, 100, 101, 102], clustering techniques have done well in predicting future incidents and inferring vulnerability patterns in code, as have NB, SVM, and RF in ranking risks and identifying proper permission levels. Both vulnerability management and SPM also focus on devising proper metrics for ML applications: from heuristics based on the training set [100] and Jaro distance [101], to outside reputation system oracles [99], metrics are needed to compare dependency graphs, string similarities, and inferred vulnerability patterns. In most use cases, because of the need for labeled data to train supervised learning systems, many systems follow the generalized training process in Figure 1: "Knowledge base → offline trainer → online or offline classifier". When policy management decisions need feedback, a HITL design is in place where end users' feedback is directed to the knowledge base. One distinguishing tradition of ML applications research in this domain is a strong emphasis on measurement - selecting or engineering proper similarity or scoring metrics is often an important point of discussion in the research literature.
From earlier uses of heuristics in clustering algorithms, to more recent semantic connectivity measurements applied in semi-supervised systems, both the metrics and the system designs for vulnerability and security policy management have evolved to not only identify, but also to infer and predict future vulnerable instances.
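Since Jaro distance recurs as a string-similarity metric in this domain [101], a minimal implementation of the standard Jaro similarity (1 minus the distance) may be a useful reference:

```python
def jaro(s1, s2):
    """Standard Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    # Characters match if equal and within half the longer string's length.
    window = max(len(s1), len(s2)) // 2 - 1
    m1 = [False] * len(s1)
    m2 = [False] * len(s2)
    matches = 0
    for i, c in enumerate(s1):
        lo, hi = max(0, i - window), min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not m2[j] and s2[j] == c:
                m1[i] = m2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count transpositions: matched characters out of order.
    t, k = 0, 0
    for i in range(len(s1)):
        if m1[i]:
            while not m2[k]:
                k += 1
            if s1[i] != s2[k]:
                t += 1
            k += 1
    t //= 2
    return (matches / len(s1) + matches / len(s2)
            + (matches - t) / matches) / 3

print(jaro("martha", "marhta"))  # classic example: 0.944...
```

In [101]'s setting, such a metric scores how closely an identifier or code fragment resembles a known vulnerability pattern before clustering.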

6.2 Information Flow and DDoS

Compared to other security domains, ML research in information flow and DDoS focuses more on evasion tactics and the limits of ML systems in adversarial environments. Hence we grouped the two sub-domains together, and marked studies in Table 8 with "(IF)" and "(DDoS)" accordingly. For DDoS [27, 103, 104], the usual assumption is that patterns of attack and abuse traffic are different from normal traffic [103], but [104] challenged it by proposing an adversary who can generate attributes that look as plausible as actual attributes in benign patterns, causing ML-based automated signature generation to fail to distinguish benign from malicious byte sequences. Then, [27] introduced GT to evaluate DDoS attacks and defenses in the real world. For information flow [24, 105, 106, 107], assumptions can take various forms. In PDF classifiers based on document structural features, it is that a malicious PDF has a different document structure than benign PDFs [105]; in an Android privacy leak detector, it is that the majority of an Android application's semantically similar peers have similar privacy disclosure scenarios [106]. But [24] poses semi-aggressive and active attackers with some information about the data, feature sets, and/or algorithms, and these attackers successfully evade ML-based PDF classifiers. As another example, PDF malware could be classified [105], and then a generic and automated evasion technique based on genetic programming was successfully demonstrated [107]. Overall, while using SVM, RF, and decision trees trained with labeled data to detect and predict DDoS and malicious information and data flows, ML applications in information flow and DDoS challenge the usual assumption of stationary adversary behaviors.
From collecting only local information, to proposing a general game theory-based framework to evaluate DDoS attacks and defenses, and from using a static method to detect malicious PDF files to generic automated evasion, the scope of ML applications in both DDoS and IF has expanded and generalized over the years.
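A minimal sketch of the feature-space evasion idea discussed above: against a known linear classifier, an attacker greedily mutates attacker-controlled features until the score crosses the decision boundary. The weight vector, feature names, and step size are all invented for illustration; the surveyed attacks [24, 107] operate on real PDF feature spaces, often with far less attacker knowledge.

```python
# Invented weights for a toy linear detector; positive score = flagged.
WEIGHTS = {"js_count": 2.0, "obj_count": 0.1, "page_count": -0.5}
BIAS = -1.0

def score(features):
    """Linear decision function of the (hypothetical) target classifier."""
    return sum(WEIGHTS[k] * v for k, v in features.items()) + BIAS

def evade(features, mutable=("page_count",), step=1.0, budget=20):
    """Greedily mutate attacker-controlled features until classified benign."""
    feats = dict(features)
    for _ in range(budget):
        if score(feats) <= 0:
            return feats
        # Push each mutable feature in the direction that lowers the score.
        for k in mutable:
            feats[k] -= step if WEIGHTS[k] > 0 else -step
    return feats

malicious = {"js_count": 2.0, "obj_count": 5.0, "page_count": 1.0}
evaded = evade(malicious)
print(score(malicious) > 0, score(evaded) <= 0)
```

The point the surveyed work makes is structural: whenever mutable features carry weight, a stationary classifier offers the adversary a gradient to descend, which is why [27] reframes the problem in game-theoretic terms.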

7. MALWARE, SOCIAL ENGINEERING & IDS

7.1 Malware Detection and Mitigations

Program-centric or system-centric, there are 3 areas that draw the most ML application research attention in malware: malware detection [108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119], classifying unknown malware into families [31, 120, 121, 122, 123, 124], and auto-extracting program or protocol specifications [125, 126, 127]. Realizing that signature- and heuristic-based malware detectors can be evaded by obfuscation and polymorphism [118], more behavior-based matching and clustering systems and algorithms have been researched. Figure 1 already shows a generalized ML system design for malware detection and classification, and a more detailed description is below:

1. Collect malware artifacts and samples, analyze them, and execute them in a controlled virtual environment to collect traces, system calls, API calls, etc. [116, 119, 122]³ Or, directly use information from already completed static and/or dynamic analyses.

³ We direct our readers to [21] for more details; it evaluated the rigor and prudence of academic research up to 2011 that relies on malware execution, and provided a guiding rubric for safely handling malware datasets.


2. Decide or devise similarity measurements between generalized binaries, system call graphs (SCG), function call graphs (FCG), etc., then extract features [120, 121, 124, 126]

3. Classify malware artifacts into families in-sample, or cluster them with known malware families. The classifiers and clustering engines are usually trained with labeled data [58, 118, 123]. Popular ones are SVM and RF for classification, and hidden Markov models (HMM) and KNN alongside different clustering techniques.

Even in the use case of auto-extracting specifications, supervised learning with labeled data is needed when behavior profiles, state machine inference, fuzzing, and message clustering are present. Evasion of detectors and poisoning of ML algorithms are also discussed; typical evasion techniques include obfuscation, polymorphism, mimicry, and reflecting-set generation [24, 113]. Malware detection and matching based on structural information and behavior profiles [118, 120, 124] show a tendency to use graph-based clustering and detection algorithms, and the similarity measurements used in these algorithms range from Jaccard distance to new graph-based matching metrics. While clustering techniques have mostly been used in malware detection, a nearest neighbor technique has been explored to evade malware detection.
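The three steps above can be sketched end-to-end in miniature: Jaccard similarity over behavior profiles (here, invented sets of system calls) feeding a greedy single-linkage grouping. The profiles and the threshold are illustrative assumptions, not taken from any surveyed system.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of observed system calls."""
    return len(a & b) / len(a | b)

# Hypothetical behavior profiles collected from sandboxed execution (step 1).
PROFILES = {
    "sample1": {"open", "read", "connect", "send"},
    "sample2": {"open", "read", "connect", "send", "close"},
    "sample3": {"fork", "exec", "ptrace"},
}

def cluster(profiles, threshold=0.5):
    """Steps 2-3: join samples into a family when similarity >= threshold."""
    clusters = []
    for name, calls in profiles.items():
        for c in clusters:
            if any(jaccard(calls, profiles[m]) >= threshold for m in c):
                c.append(name)
                break
        else:
            clusters.append([name])
    return clusters

print(cluster(PROFILES))
```

The surveyed systems replace both pieces at scale - e.g. locality-sensitive hashing to avoid all-pairs comparison [124], or graph-matching metrics instead of set overlap [118, 120] - but the detect-by-similarity structure is the same.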

7.2 Social Engineering: Phishing, Malicious Content and Behaviors

Spam, and malicious webpages and URLs that redirect or mislead unsuspecting users to malware, scams, or adult content [66], are perhaps as old as civilian use of the Internet. Research literature mostly focuses on 3 major areas: detecting phishing and malicious URLs [128, 129, 130, 131, 132, 133], filtering spam or fraudulent content [26, 29, 134, 135, 136, 137, 138, 139, 140, 141], and detecting malicious user account behaviors [142, 143, 144, 145, 146]. Moreover, because phishing⁴ is a classic social engineering tactic, it is often the gateway of many studies to detect malicious URLs, spam, and fraudulent content. To identify malicious URLs, ML-based classifiers draw features from webpage content (lexical, visual, etc.), URL lexical features, redirect paths, host-based features, or some combination of them. Such classifiers usually act in conjunction with knowledge bases, which are usually in-browser URL BLs or from web service providers. If the classifier is fed with URL-based features, it is common to set up a URL aggregator as a pre-processor before extracting features. Mostly using the supervised learning paradigm, NB, SVM with different kernels, and LR are popular ML classifiers for filtering spam and phishing. Meanwhile, GT-based learning to deal with active attackers has also been evaluated in spam filtering: [26] evaluates a Bayesian game model where the defense is not fully informed of the attacker's objectives and the active adversary can exercise control over data generation, and [29] proposes a Stackelberg game where the spammer reacts to the learner's moves. Stronger assumptions also exist: for example, [139] assumes spammers'

⁴ We agree with [131]'s definition of phishing: "without permission, alleges to act on behalf of a third party with the intention of confusing viewers into performing an action with which the viewer would only trust a true agent of the third party".

phone blocks follow a beta distribution as the conjugate prior for Bernoulli and binomial distributions. Another social engineering tactic is spoofing identities with fake or compromised user accounts; detection of such malicious behaviors utilizes features from user profiles and spatial, temporal, and spatial-temporal patterns, with user profiles used in particular to construct normality. Graph representations and trust propagation models are also deployed to distinguish genuine and malicious accounts with different behaviors and representations [144, 145, 146]. Tracing the chronology of applying ML to defend against social engineering, one trend is clear: while content-, lexical-, and syntactic-based features are still widely used, constructing graph representations and exploring temporal patterns of redirect paths, events, accounts, and behaviors have been on the rise as feature spaces for ML applications in defending against social engineering efforts. Accordingly, the ML techniques have also shifted from different classification schemes to graphical models. It is also noteworthy that in [29, 146], addressing adversarial environments' challenges to ML systems is elaborated as a primary research area, instead of a short discussion point.
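As a minimal illustration of the URL lexical features described above, the sketch below derives a few features of the kind the surveyed classifiers consume; the specific feature names and token list are invented for illustration, not taken from any paper.

```python
from urllib.parse import urlparse

# Hypothetical token list; real systems learn such indicators from data.
SUSPICIOUS_TOKENS = ("login", "verify", "secure", "account", "update")

def url_features(url):
    """Extract simple lexical features of a URL for a downstream classifier."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),            # many subdomains is a cue
        "has_ip_host": host.replace(".", "").isdigit(),  # raw-IP hosts
        "num_hyphens": host.count("-"),
        "suspicious_tokens": sum(t in url.lower() for t in SUSPICIOUS_TOKENS),
    }

f = url_features("http://paypal-login.verify-account.example.com/update")
print(f["suspicious_tokens"], f["has_ip_host"])
```

A feature dictionary like this is what gets fed, alongside host-based and redirect-path features, into the NB, SVM, or LR classifiers the surveyed work trains.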

7.3 IDS

From feature sets to algorithms and systems, IDS has been extensively studied. However, as [147] cautioned, ML can easily be conflated with anomaly detection. While both are applied to build IDS, an important difference is that ML aims to generalize expert-defined distinctions, whereas anomaly detection focuses on finding unusual patterns - and attacks are not necessarily anomalous. For example, [148] distinguished the n-gram model's different use cases: anomaly detection uses it to construct normality (hence more appropriate when no attack is available for learning), while ML classifiers learn to discriminate between benign and malicious n-grams (hence more appropriate when more labeled data is present). Since 2008, works at top venues have added to the rigor of ML applications in IDS. For example, a common assumption of IDS is that anomalous or malicious behaviors or traffic flows are fundamentally different from normal ones, but [147] challenges this assumption by studying low-cardinality intrusions where attackers don't send a large number of probes. To address adversarial learning environments and minimal labels in training data, semi-supervised paradigms, especially active learning, are also used [147, 149]. Heterogeneous designs of IDS in different use cases give rise to many ad-hoc evaluations in research works, and a reproducibility and comparison framework was proposed to address the issue [150]. Meanwhile, techniques such as graph-based community detection [151], time series-based methods [152, 153], and generalized support vector data description for auto-feature selection in cyber-physical systems and adversarial environments [154] have also emerged. Although they carry different assumptions of normality and feature representations, the supervised ML system design remains largely the same.
Besides the fact that more techniques and use cases have been proposed, the focus of research in IDS has evolved from discovering new techniques and use cases to rigorously evaluating the fundamental assumptions and workflows of IDS. For example, while feature selection has remained a major component, there are re-examinations of assumptions and measurements on what constitutes normality and abnormality [151], alternatives to more easily acquire


data and use low-confidence data for ML systems [149], and proposals on validating the reproducibility of results from different settings [150].
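The n-gram normality construction from [148]'s anomaly-detection use case can be sketched as follows: train only on benign payloads, then flag payloads containing many n-grams never seen in benign traffic. The payloads and the flagging threshold are invented for illustration.

```python
def ngrams(s, n=3):
    """The set of character n-grams of a payload string."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}

# Hypothetical benign training traffic; no attack samples are needed.
BENIGN = ["GET /index.html HTTP/1.1", "GET /style.css HTTP/1.1"]
normal = set().union(*(ngrams(p) for p in BENIGN))

def anomaly_score(payload):
    """Fraction of the payload's n-grams never seen in benign traffic."""
    grams = ngrams(payload)
    return len(grams - normal) / len(grams)

print(anomaly_score("GET /index.html HTTP/1.1"))  # fully known payload
print(anomaly_score("GET /%u9090%u6858.. HTTP/1.1") > 0.3)
```

The supervised alternative [148] describes would instead train a classifier on labeled benign and malicious n-grams - appropriate only when attack samples are plentiful, which is exactly the trade-off the section above discusses.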

8. FUTURE DEVELOPMENT

One key goal of our SoK survey is to help researchers look into the future. ML applications in security domains are attracting academic research attention as well as industrial interest, and this presents a valuable opportunity for researchers to navigate the landscape between ML theories and security applications. There are also opportunities to explore whether some types of ML paradigms are especially well suited to particular security problems. Apart from highlighting that 1) semi-supervised and unsupervised ML paradigms are more effective in utilizing unlabeled data, hence easing the difficulty of obtaining labeled data, and 2) GT-based ML paradigms and HITL ML system designs will become more influential in dealing with semi-aggressive and aggressive attackers, we also share the following seven speculations on future trends, based on our current SoK.

1. Metric Learning: Measurement has become more and more conspicuous in ML research in security, mostly in similarity measurement for clustering algorithms [53, 65, 76, 120]. Proper measurements and metrics are also used to construct ground truths to evaluate ML-based classifiers, and also play important roles in feature engineering [55, 140, 155, 156]. Given the ubiquitous presence of metrics and the complex nature of constructing them, ML applications in security will benefit much from metric learning.

2. NLP: Malicious content, spam, and malware analysis and detection have used tools from statistical language modeling (e.g. n-gram-based representations for strings in code and HTTP requests) [62, 74, 116, 138, 141, 157]. As textual information explodes, NLP will become more widely used beyond source filtering and clustering, e.g. [57] uses n-gram models to infer state machines of protocols.

3. Upstream movement of ML in security defense designs: In malware detection and classification, behavior- and signature-based malware classifiers have used inputs from static and dynamic binary analysis as features [110, 111, 123, 126], and [112] already shows that RNNs can be applied to automatically recognize functions in binary analysis. We also see ML algorithms applied in vulnerability, device, and security policy management, DDoS mitigation, information flow quantification, and network infrastructure [103, 106, 116, 141]. Hence, it is reasonable to expect that more ML systems and algorithms will move upstream in more security domains.

4. Scalability: With increasing amounts of data from growing numbers of information channels and devices, the scale of ML-based security defenses will become a more important aspect of researching ML applications in security [12, 50, 109, 131]. As a result, large-scale systems will enable distributed graph algorithms in malware analysis, AS path hijacker tracing, cyber-physical system fault correlation, etc. [49, 56, 72, 96, 106, 118]

5. Specialized probabilistic models will be applied beyond the context of classifiers, e.g. access control [81].

6. High FP rates have always been a concern for system architects and algorithm researchers [86, 150]. Reducing FP rates will grow from an ad-hoc component in various system designs to independent formal frameworks, algorithms, and system designs.

7. Privacy enforcement was recently framed as a learning problem in [158], in light of many publications on privacy preservation in ML algorithms and privacy enhancement by probabilistic models [11, 159, 160, 161, 162, 163]. This new trend will become more prominent.

9. CONCLUSION

In this paper, we analyzed ML applications in security domains by surveying literature from top venues of our field between 2008 and early 2016. We attempted to bring clarity to a complex field with intersecting areas of expertise by identifying common use cases, generalized system designs, common assumptions, metrics or features, and ML algorithms applied in different security domains. We constructed a matrix showing the intersections of ML paradigms and three different taxonomy structures to classify security domains, and showed that while much research has been done, explorations in GT-based ML paradigms and HITL ML system designs are still much desired (and under-utilized) in the context of active attackers. We point out 7 promising areas of research based on our observations, and argue that while ML applications can be powerful in security domains, it is critical to match ML system designs with the underlying constraints of the security applications appropriately.

10. ACKNOWLEDGMENT

We would like to thank Megan Yahya, Krishnaprasad Vikram, and Scott Algatt for their time and valuable feedback.

APPENDIX

References

[1] D. E. Denning, “An intrusion-detection model,” IEEE Trans. Softw. Eng., 1987.

[2] (1998) Darpa intrusion detection evaluation. [Online].Available: http://www.ll.mit.edu/ideval/data/

[3] (1999) Kdd cup 1999 data. [Online]. Available: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html

[4] C. Elkan. (1999) Results of the kdd’99 classifier learning contest. [Online]. Available: http://cseweb.ucsd.edu/˜elkan/clresults.html

[5] Black hat usa 2014. [Online]. Available: https://www.blackhat.com/us-14/training

[6] Rsa 2016. [Online]. Available: https://www.rsaconference.com/events/us16/expo-sponsors/exhibitor-list

[7] (2016) Darpa cyber grand challenge. [Online]. Avail-able: http://www.cybergrandchallenge.com

[8] ACM. (2012) Acm 2012 computing classification system. [Online]. Available: https://www.acm.org/publications/class-2012

[9] (2016) Aisec 2016 topic classifications. [Online].Available: http://teamcore.usc.edu/people/arunesh/AISec2016/call-for-papers.html

[10] R. Chow, P. Golle, and J. Staddon, “Detecting privacy leaks using corpus-based association rules,” in KDD 2008.


Table 3: Botnet Detectors (Study | Goal | System Components | ML Techniques)

- Gu NDSS'08 [47] | Detect hosts of botnets and malicious patterns | WL, feature extractor, clustering engine | X-Means clustering
- Gianvecchio NDSS'08 [53] | Distinguish between human- and bot-generated chats | Entropy-based classifier, ML-based classifier | NB
- Gu USENIX'08 [52] | Network-based anomaly detection for botnet C&C channels | WL, watch list, protocol matcher, activity and message response detector, spatial-temporal correlation engine | Sequential probability ratio testing (SPRT), hierarchical clustering
- Hu NDSS'09 [54] | Detect redirection botnets | BL, WL, resource filtering (content analysis, URL probing, network analysis, IP correlation), classifier | Linear SVM, SPRT
- Nagaraja USENIX'10 [49] | Find peer-to-peer botnets by traffic patterns | Traffic monitor, honeypot communications, background traffic collector, graph preprocessing, inference system | Graph clustering
- Chen KDD'11 [56] | Classify bot or human interactions | Flow parser, BL, graph feature extractor, online classifier | Least squares SVM
- Jacob USENIX'11 [48] | Find C&C connections from bot traffic | Host-based information, C&C behavioral graphs, graphs of known C&C connections | Graph clustering (metric: non-induced maximum common subgraph)
- Jacob USENIX'12 [51] | Detect and contain malicious web crawlers | BL, WL, classifier | Ensemble classifier: NB, SVM (Gaussian radial basis function "RBF" kernel), association rules
- Krueger CCS'12 [57] | Infer a state machine and message format of a protocol from network traffic | Message embedder, clusterer, inference engine, template engine | N-grams, tokens, binomial test (to reduce dimensionality), non-negative matrix factorization (NMF), Markov model
- Zhang NDSS'13 [50] | Study bot searches on a search engine | Feature extraction, pattern tree generation, topic clustering, classifier | Spectral clustering (metric: Jaccard distance), single linkage hierarchical clustering (SLHC)
- Lee NDSS'16 [55] | Detect game bots in massively multiplayer online role-playing games | Game logs, transformation, feature extraction, metric computing, classifier | Logistic regression (LR), cosine similarity, Hurst exponent, exponentially weighted moving average

Table 4: Proxies and DNS (Study | Goal | System Components | ML Techniques)

- Small USENIX'08 [63] | Auto-generate responses to malicious requests and harness payloads | Data collection, pre-processing, classifier, language model generation | N-grams, Markov model, k-means iterative clustering, Needleman-Wunsch string alignment for contextual dependency, TF/IDF, k-medoids algorithm
- Prakash KDD'09 [164] | Automated BGP update analyzer to find anomalies | Data flow, temporal analysis, frequency analysis | Haar wavelet transform, a frequency analysis technique akin to NB
- Qiu USENIX'09 [60] | Locate prefix hijackers by AS path | Target monitor, classifier, rank engine | Hierarchical clustering (metric: similarity of AS-level paths to the prefix), ranking by likelihood of seeing hijacking events
- Song NDSS'09 [62] | Detect anomalies in web traffic | Packet assembler, WL, feature extractor, classifier | N-gram feature extraction, mixture-of-Markov-chains model, K-means clustering
- Antonakakis USENIX'10 [58] | Dynamic reputation system for DNS | Honeypot, knowledge base, BL, reputation system classifier | Decision tree, logit boost, X-means clustering
- Bilge NDSS'11 [61] | Detect malicious DNS domains with passive DNS analysis | Traffic recorder, feature extractor, domain BL, learning module, classifier | C4.5 decision tree
- Antonakakis USENIX'11 [59] | Detect malware domains at the upper-level DNS hierarchy by passive DNS traffic analysis and global DNS query resolution | Knowledge base, learning module, training classifier, and feature extractor | RF
- Fatemieh NDSS'11 [65] | Protect integrity of measurement in whitespace networks | Traffic collector, signal propagation, seed trusted set | SVM (quadratic kernel)
- Venkataraman NDSS'13 [64] | Infer evolution of malicious activities in regions of the Internet | Changes in prefixes and IP address distributions | Decision tree
- Vissers NDSS'15 [66] | Classify parked domains | Browser-based client-side classifier | RF


Table 5: Password Meters & Crackers, and CAPTCHA Breakers (Study | Goal | Recognition or Classification Engine)

- Golle CCS'08 [67] | Asirra CAPTCHA attack | SVM (RBF kernel)
- Frank CCS'08 [81] | Probabilistic model for role engineering in role-based access control (RBAC) | Disjoint Decomposition Model (a structural equivalent to the infinite relational model), Gibbs sampling to infer DOM parameters, clustering
- Zander Sec'08 [76] | Measurement and sampling techniques to reveal hidden services | Remote clock-skew estimation and synchronized sampling
- Weir SP'09 [71] | Password cracker using probabilistic context-free grammar | Context-free grammar with information about the probability distribution of user passwords
- Bursztein SP'11 [82] | Attack noise-based non-continuous audio CAPTCHAs | Regularized Least Squares Classification (RLSC)
- Zheng CCS'11 [80] | Behavioral metric for user verification by mouse movement | SVM (RBF kernel)
- Bursztein CCS'11 [68] | Text-based CAPTCHA strengths and weaknesses; recommended engines for different phases of attacks, and principles for secure CAPTCHAs | SVM (RBF kernel), KNN
- Castelluccia NDSS'12 [74] | Password strength meter | Markov models built on an n-gram database
- Narayanan SP'12 [78] | Identify anonymous authors by stylometry at internet scale | RLSC, linear SVM, NB
- Dyer SP'12 [79] | Traffic analysis attack against common countermeasures | NB, multinomial NB, variable n-gram, SVM
- Kelley SP'12 [75] | Measure password strength by simulating password-cracking algorithms | Brute-force Markov models
- Gao CCS'13 [69] | Attack hollow textual CAPTCHAs | CNN
- Afroz SP'14 [77] | Detect multiple identities of anonymous authors | Principal component analysis (PCA), L1-regularized LR, linear SVM
- Ma SP'14 [73] | Study of probabilistic password models | Markov models and context-free grammar
- Chatterjee SP'15 [72] | Natural language encoder design for cracking-resistant password vaults | Natural language encoder (NLE)
- Freeman NDSS'16 [25] | Probabilistic model for user authentication at login time | Probabilistic model
- Gao NDSS'16 [70] | Generic attack model on text-based CAPTCHAs | SVM (kernel unspecified), back-propagation neural network, template matching, CNN

Table 6: Software and Application Security (Study | Goal | System Components | ML Techniques and Metrics, if specified)

- Zhang Sec'08 [87] | Predict websites blacklisted by browsers | Security logs, pre-filter by knowledge base, relevance ranking, severity assessing, final BL | Link analysis and rank relevance correlation statistics on a weighted undirected graph
- Robertson NDSS'10 [91] | ML-based web app anomaly detection with scarce training data | Knowledge base (local and global), offline and online trained classifier | HMM-encoded probabilistic grammar, agglomerative hierarchical clustering
- Chapman CCS'11 [92] | Quantify side channel leaks from web apps | Web app, web crawler, metrics and feature extraction, quantifier (classifier, entropy calculator, Fisher criterion calculator) | Nearest-centroid (metrics: Total-Source-Destination, edit distance, random distance)
- Sculley KDD'11 [85] | Detect adversarial advertisements | Model aggregation, labeled data, stratified sampling, classifier, human monitoring feedback | Linear SVM and its variations
- Curtsinger Sec'11 [83] | In-browser JS malware detection | URL BL, scan scripts, code de-obfuscator, feature extractor, classifier | NB (metric: matched strings)
- Lu CCS'11 [88] | Browser component to detect malicious search poisoning | Browser/network/search/user redirection chain information collector, feature extractor, classifier | C4.5 decision tree
- Kapravelos USENIX'13 [84] | Detect evasions in malicious JS | Web information, drive-by-download detector oracle, syntax tree parser, similarity measurement, pairing, evasion detector | KNN (metric: direct editing distance between nodes)
- Bhagavatula AISec'14 [86] | Auto-block ad resources | Independent classifier (trained with labeled datasets) | KNN; experimented with NB, SVM (polynomial and RBF kernels), LR
- Soska Sec'14 [89] | Detect vulnerable websites before compromise | Websites' statistics and contents, parser and filter, dynamic feature extractor, classifier | C4.5 decision tree ensembles
- Borgolte Sec'15 [90] | Detect website defacement | URL of website (no prior knowledge of content or structure of webpages needed) | Stacked encoder, deep neural network, image-based object recognition


Table 7: Vulnerability and Security Policy Management (Study | Goal | System Components | Similarity or Scoring Metrics | ML Techniques)

- Makanju KDD'09 [98] | Mine clusters from event logs for fault management | 3-step partitions, independent system | Token- and bijection-based heuristics | Iterative hierarchical clustering
- Richardson AISec'10 [94] | Limits of automated fingerprinting for OS | Probe generator, candidate probe set, trainer, learner, feedback loop | Not applicable (NA) | C4.5 decision tree, rule learner, SVM-SMO, instance-based clustering
- Bozorgi KDD'10 [99] | Predict possibility and timing of vulnerability exploits | Vulnerability disclosure reports, feature extractor, offline classifier, online predictor | Common Vulnerability Scoring System (CVSS) score, distance to maximum-margin hyperplane | Linear SVM
- Peng CCS'12 [100] | Score and rank risks of Android apps | App meta-information corpus, trainer, classifier | Heuristics derived from models | NB (basic and with prior), a hierarchical Bayesian model built as an extension to latent Dirichlet allocation (LDA)
- Pandita USENIX'13 [96] | Identify permission sentences in mobile apps | Semantic graph generator and engine, flow-of-logic engine, annotation engine, classifier (with user feedback loop to build a knowledge base) | NA | Part-of-speech (POS) tagging, phrase and clause parsing, named entity recognition (NER), semantic graphs, typed dependencies
- Rasthofer NDSS'14 [95] | Identify sources and sinks from the code of any Android API | Input, semantic and syntactic feature matrices, classifier | NA | Linear SVM
- Yamaguchi SP'15 [101] | Infer vulnerability search patterns in C code | Static analyzer, signature generator, inference engine | Jaro distance | Complete-linkage clustering
- Wang USENIX'15 [97] | Semi-supervised learning for Android SPM | Knowledge base, classifier, co-occurrence learner, learning combiner, policy refinement generator feedback loop | Semantic pattern-to-rule distance | KNN (metric: semantic connectivity between known and unknown subjects)
- Liu USENIX'15 [102] | Predict future data leak instances from network logs | Data collection, BL, feature extractor, aggregation, training, classifier | Correlations between misconfigurations and malicious activities | RF

Table 8: Information Flow and DDoS (Study | Goal | System Components | ML Techniques)

- Berra CCS'08 (DDoS) [103] | Detect and predict abnormal traffic patterns in distributed systems | Local model of information collector, classifier, learner | NB
- Venkataraman NDSS'08 (DDoS) [104] | Limits of pattern-extraction algorithms in adversarial settings | Adversary challenges the signature algorithm | Generic proof of bounded FPs and reflecting set of plausible aprioris
- Yan CCS'12 (DDoS) [27] | Framework to evaluate DDoS attacks and defense | Network state, attacker's and defender's move spaces, reasoning, decisions | Game theory (semi network-form game), Bayesian network
- Srndic NDSS'13 (IF) [105] | Static method to detect malicious PDF files | Hierarchical document structural feature extractor, classifier | C5.0 decision tree, SVM-RBF
- Srndic SP'14 (IF) [24] | Evasion of learning-based PDF classifier | Attacker's knowledge of feature sets, training data, classification algorithms (one or more areas) | SVM (linear and RBF), RF as test classifiers
- Lu NDSS'15 (IF) [106] | Detect privacy leaks in data flow | App information collector, data- and system-dependence graphs, privacy disclosure analysis, peer voting engine | Ranking based on TF/IDF and cosine similarity representing semantic similarity
- Xu NDSS'16 (IF) [107] | Generic automated evasion of malicious PDF classifiers | Population initialization, mutation, variant selection, feedback to population | SVM-RBF, RF


Table 9: Malware Identification, Classification, and Mitigation

Study | Goal | System Components | ML Techniques

Ahmed CCS'09 [119] | Detect malware by Windows API call traces | Kernel mode hook, knowledge base, API call trace, feature extractor, training, classifier | C4.5 decision tree, instance-based KNN, NB, inductive rule learner, SVM-SMO

Bayer NDSS'09 [124] | Group similar malware from taint source and control flow | Knowledge base, dynamic analysis, network analysis, behavior profile generator, clustering | Locality-sensitive hashing (LSH), single-linkage clustering. Metrics: Jaccard distance, normalized compression distance

Comparetti SP'09 [126] | Automated reverse engineering of app-layer protocol specifications | Dynamic analysis, session analysis, message clustering, state machine inference, labeling, fuzzing | Partitioning around medoids (PAM), metric: Jaccard index

Kolbitsch Sec'09 [118] | Detect malware at end hosts | Knowledge base, program slicing, behavioral profile generation, matching | Directed acyclic graph (DAG) matching algorithms

Lanzi CCS'10 [116] | Diversity of system calls study (system-centric malware analysis) | Training dataset, system call sequence miner | N-gram models

Ye KDD'10 [123] | Auto-group malware samples into families with a cluster ensemble | Feature extractor, n-gram slicer, cluster ensemble, signature generator, human feedback loop | PCA; ensemble: hierarchical & partitional clustering, with weighted subspace K-medoids

Fredrikson SP'10 [125] | Auto-extraction of specifications of classes of programs | Training datasets, behavior miner, extracted specifications | Simulated annealing in a graph algorithm

Jang CCS'11 [122] | Reduce feature dimensionality for large-scale malware clustering | Static and dynamic analysis, fingerprinting, scheduler, clustering engine | Feature hashing (metric: Jaccard similarity), agglomerative hierarchical clustering

Ye KDD'11 [117] | Detect malware by file relations | File relation and content collector, feature extractor, classifier | Customized parametric model on content and non-parametric one on relations

Antonakakis Sec'12 [114] | Detect domain generation algorithm (DGA)-based malware | Knowledge base, discovery engine, trainer, classifier within networks | X-means clustering, spectral clustering, decision tree, hidden Markov model (HMM)

Rahman Sec'12 [115] | Detect malware propagated by social networks | User authorization, post crawler, feature extractor, WL, BL, trained (manually labeled data) classifier, user feedback | SVM (kernel unspecified)

Kong KDD'13 [120] | Malware classification on structural information | Labeled set, function call graph (FCG) extractor, maximum-margin distance learner, ensemble classifier | SVM-Gaussian kernel, KNN (metric: maximum margin-guided similarity between FCGs)

Borgolte CCS'13 [121] | Signature-based malware clustering | Website parser, DOM tree difference computation engine, feature extractor, signature generator | Density-based clustering with ordering points (metrics: Shannon entropy, Kolmogorov complexity)

Biggio AISec'14 [31] | Perfect-knowledge attacker poisoning malware clustering | Malware instruction set representation, message embedder, clustering, classifier | Single-linkage clustering

Tamersoy KDD'14 [108] | Detect malware by file-relation graphs | File collector, LSH, graph builder, belief propagator | MinHash, LSH, pairwise Markov random field, unweighted bipartite graph

Invernizzi NDSS'14 [109] | Detect malware downloads in networks | Network traffic collector, feature extractor, distributed classifier | Decision tree (ground truth)

Arp NDSS'14 [110] | Explainable Android malware detection | Broad static analysis, feature embedder, detector, explanation | Linear SVM

Graziano Sec'15 [111] | Detect and forecast malware samples and trends from a public dynamic analysis sandbox | Dynamic analysis, binary similarity, fine-grained static analysis, classifier | Logistic model tree

Shin Sec'15 [112] | Recognize functions in malware binaries | Binary, fixed-length subsequence extractor, learner | Recurrent neural network (RNN) with 1 hidden layer (optimized with stochastic gradient descent, "SGD")

Kirat CCS'15 [127] | Auto-extract malware evasion behavior signatures | Execution event extractor, call sequence alignment, event comparison, clustering | Local sequence alignment, IDF, hierarchical clustering (metric: Jaccard similarity)

Smutz NDSS'16 [113] | Detect malware mimicry evasions with ensemble classifiers | Two malware classifiers as an ensemble, using mutual agreement analysis | Ensemble: linear SVM, RF
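Several Table 9 systems (e.g. Bayer NDSS'09 [124], Jang CCS'11 [122], Tamersoy KDD'14 [108]) cluster malware by Jaccard similarity over behavioral or static feature sets, approximated at scale via MinHash/LSH. A toy sketch of the MinHash estimate under that general recipe, not any paper's implementation (feature names are illustrative):

```python
def jaccard(a, b):
    """Exact Jaccard similarity between two feature sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def minhash_signature(features, seeds):
    """MinHash signature: for each seed, keep the minimum hash over the set."""
    return [min(hash((seed, f)) for f in features) for seed in seeds]

def estimated_jaccard(sig_a, sig_b):
    """The fraction of agreeing signature positions estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

Comparing fixed-length signatures instead of raw feature sets is what makes pairwise comparison (and LSH bucketing) tractable on millions of samples.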


Table 10: Phishing, Malicious Content and Behaviors

Study | Goal | Features | System Components | ML Techniques

Chatterjee AISec'08 [134] | Content-based reputation system robust against Sybil attacks | Content evolution, contribution history | Editing history, local and global reputation growth and correlation, scoring | Customized reputation scoring functions

Ma ICML'09 [128] | Detect malicious websites by URLs in active learning | Lexical- and host-based features, no content features | Live URL feed, labeling engine, feature extractor, classifier with feedback loop | Perceptron, LR (with SGD), passive-aggressive and confidence-weighted algorithms

Ma KDD'09 [129] | Detect malicious websites from URLs | Lexical- and host-based features of URLs | Complementary to BLs | NB, SVM-RBF, linear SVM, L1-regularized LR

Hao Sec'09 [135] | Automated reputation engine for spam filter | Network-level and SMTP header features, temporal patterns | Emails, BL, WL, feature extractor, classifier, policy pool | A rule-based linear learning ensemble

Blum AISec'10 [130] | Detect existing and emerging phishing domains | Lexical- and content-based features | Rule-based BL, online learning classifier | Deep MD5 matching (metric: Kulczynski 2 coefficient)

Whittaker NDSS'10 [131] | Detect phishing web pages | Hosting properties, content-based and URL-structure features | Search engine BL, URL aggregator, feature extractor, classifier | LR with online gradient descent

Thomas SP'11 [136] | Real-time URL spam filter | URL lexical, content, hosting property, browser, DNS resolver, IP analysis features | Web service with URL stream, URL aggregator, feature collector and extractor, classifier, feedback loop (BL annotation training) | L1-regularized LR, LR-SGD

Bruckner KDD'11 [29] | Stackelberg game model in spam filtering | A data generator reacts to the learner's moves | Learner, data generator, loss function | Stackelberg-based prediction game with different loss functions

Lee NDSS'12 [132] | Detect malicious URLs in streams | Properties of redirect chains of URLs | URL stream data collection, feature extraction, training | L2-regularized LR

Afroz SP'12 [137] | Detect frauds in online writing styles | Lexical-, syntactic-, content-specific features | Feature extractor, classifier | SVM-SMO; experimented with NB, SVM-RBF, C4.5, LR

Egele NDSS'13 [142] | Detect compromised accounts by behavior change | Temporal behavior and content patterns | Data collection, labeling, training | N-gram model, SVM-SMO

Großhans ICML'13 [26] | Bayesian game model in adversarial spam filtering | Game between a learner of a model and a data generator | Learner, data generator (conflicting but not necessarily adversarial), cost functions | Bayesian regression model. Baselines: Nash regression, robust ridge regression, regular ridge regression

Invernizzi SP'12 [138] | Use search engines to find other malicious webpages | Content- and link-based features | Crawler, profiler, search engine's BL, initial set of malicious pages | N-grams, term extraction

Zhao KDD'13 [133] | Online active learning for malicious URL detection | Lexical- and host-based features | Live data feed, feature collector, cost-sensitive update, active learning module, classifier | Customized online active cost-sensitive algorithm

Jiang Sec'13 [139] | Detect spam cell numbers | Call detail records | Records collector, feature extractor, classifier | Customized Bayesian model

Zhang NDSS'14 [140] | Detect auto-generated web spam content | Content features | Content filter, inverted index engine, clustering | Shingling, POS tagging (metric: Jaccard index)

Viswanath Sec'14 [143] | Unsupervised anomalous user behavior detection | Spatial, temporal, spatial-temporal | Normality construction, PCA, detection | PCA, KNN

Whalen AISec'14 [141] | Distributed content anomaly detection (CAD) | N-grams of payloads | Distributed models over application servers | Aggregated RF, LR, Bloom filter

Boshmaf NDSS'15 [144] | Predict benign users who befriended fake accounts | User profiles, communications, activities history | Graph representation, feature extraction, trust propagation, rank users | Louvain method for community detection, modified random walk

Stringhini Sec'15 [145] | Detect malicious online accounts accessed by a common set of machines | User-agent correlation, account usage, event time series | IP mapping, graph representation, feature extraction, detection | Louvain method

Wang Sec'14 [146] | Detect adversarial crowdsourcing | User profile, behaviors, temporal patterns | Normality construction, input, feature extraction, classification | Customized Bayesian model, SVM (RBF and polynomial), RF
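Many Table 10 detectors (e.g. Ma ICML'09 [128], Thomas SP'11 [136]) train online logistic regression over hashed lexical URL features. A stdlib-only sketch of that general recipe; the tokenizer, dimensions, and example URLs are illustrative assumptions, not any paper's actual pipeline:

```python
import math
import re

DIM = 2 ** 16  # size of the hashed feature space

def url_features(url):
    """Hash lexical tokens of a URL into a sparse set of feature indices."""
    tokens = re.split(r"[/.?=&\-_:]", url.lower())
    return {hash(t) % DIM for t in tokens if t}

def predict(w, feats):
    """Logistic probability that the URL is malicious."""
    z = sum(w[i] for i in feats)
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def sgd_update(w, feats, label, lr=0.1):
    """One online logistic-regression step on a single (url, label) example."""
    g = predict(w, feats) - label
    for i in feats:
        w[i] -= lr * g

w = [0.0] * DIM  # weight vector, updated as labeled URLs stream in
```

The hashed representation keeps the model size fixed no matter how many distinct tokens appear in the stream, which is what makes the online setting practical.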


Table 11: Intrusion Detection System Research

Study | Goal | Key Assumption(s) | ML Techniques

Kloft AISec'08 [154] | Auto-select features for anomaly detection with one-class SVM | Different feature sets might be various characterizations of normality | Generalized support vector data description (SVDD)

Gornitz AISec'09 [149] | Query low-confidence observations and expand the data basis with minimal labeling | Anomaly detection is an active learning task | SVDD

Symons AISec'12 [147] | Non-parametric semi-supervised network IDS | Attackers' behaviors are not necessarily different from legitimate human users, e.g. they do not send large numbers of probes | Laplacian regularized least squares (RLS)/Bayesian kernel model on graphs representing network flow traffic

Ding KDD'12 [151] | Augment signature-based systems at network edges for multi-layer IDS | Attackers do not respect community boundaries and exhibit "anti-social behaviors" | Graph-based community detection

Xie KDD'12 [152] | Detect spam reviews by temporal patterns in an adversarial environment | Normal reviewers' arrival patterns are Poisson and temporally uncorrelated to rating patterns | Multidimensional time series (matching blocks by longest common substring algorithm)

Wressnegger CCS'13 [148] | Use of n-grams in intrusion detection | Anomaly detection and ML are two different schemes for IDS | Reviewed n-gram models in both ML and anomaly detection schemes

Momtazpour KDD'15 [153] | Detect correlated invariants in cyber-physical systems | Local invariants after filtering can still be correlated | Latent factor autoregression with exogenous input to correlate time series

Juba NDSS'15 [150] | Reproducibility and comparison framework for IDS | Uniform distribution approximates the entire web by Common Crawl URL index | Probabilistic model
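N-gram models recur across the IDS rows above (Wressnegger CCS'13 [148]) and in payload-based content anomaly detection (Whalen AISec'14 [141]). A toy sketch of the basic scheme, remembering the n-grams of benign traffic and scoring a payload by its fraction of unseen n-grams; the threshold and payloads are illustrative, not any paper's configuration:

```python
def ngrams(payload, n=3):
    """All contiguous character n-grams of a payload string."""
    return {payload[i:i + n] for i in range(len(payload) - n + 1)}

class NgramAnomalyDetector:
    """Remember n-grams seen in benign traffic; flag payloads whose
    fraction of previously unseen n-grams exceeds a threshold."""

    def __init__(self, n=3, threshold=0.5):
        self.n, self.threshold, self.known = n, threshold, set()

    def fit(self, benign_payloads):
        for p in benign_payloads:
            self.known |= ngrams(p, self.n)

    def score(self, payload):
        grams = ngrams(payload, self.n)
        return len(grams - self.known) / len(grams) if grams else 0.0

    def is_anomalous(self, payload):
        return self.score(payload) > self.threshold
```

This illustrates the anomaly-detection side of the n-gram debate: no attack labels are needed at training time, but the model is only as good as its characterization of normality.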

[11] A. Johnson and V. Shmatikov, "Privacy-preserving data exploration in genome-wide association studies," in KDD 2013.

[12] A. Panchenko, F. Lanze, A. Zinnen, M. Henze, J. Pennekamp, K. Wehrle, and T. Engel, "Website fingerprinting at internet scale," in NDSS 2016.

[13] C. V. Wright, L. Ballard, S. E. Coull, F. Monrose, and G. M. Masson, "Spot me if you can: Uncovering spoken phrases in encrypted voip conversations," in SP 2008.

[14] D. Balzarotti, M. Cova, and G. Vigna, "Clearshot: Eavesdropping on keyboard input from video," in SP 2008.

[15] L. Alvisi, A. Clement, A. Epasto, S. Lattanzi, and A. Panconesi, "Sok: The evolution of sybil defense via social networks," in SP 2013.

[16] G. Danezis and P. Mittal, "Sybilinfer: Detecting sybil nodes using social networks," in NDSS 2009.

[17] G. Wang, T. Konolige, C. Wilson, X. Wang, H. Zheng, and B. Y. Zhao, "You are how you click: Clickstream analysis for sybil detection," in USENIX Security 2013.

[18] H. Yu, C. Shi, M. Kaminsky, P. B. Gibbons, and F. Xiao, "Dsybil: Optimal sybil-resistance for recommendation systems," in SP 2009.

[19] X. Xing, Y.-l. Liang, S. Huang, H. Cheng, R. Han, Q. Lv, X. Liu, S. Mishra, and Y. Zhu, "Scalable misbehavior detection in online video chat services," in KDD 2012.

[20] M. Ciglan, M. Laclavík, and K. Nørvåg, "On community detection in real-world networks and the importance of degree assortativity," in KDD 2013.

[21] C. Rossow, C. J. Dietrich, C. Grier, C. Kreibich, V. Paxson, N. Pohlmann, H. Bos, and M. Van Steen, "Prudent practices for designing malware experiments: Status quo and outlook," in SP 2012.

[22] Department of Defense, DoD Trusted Computer System Evaluation Criteria, 1985.

[23] L. Huang, A. D. Joseph, B. Nelson, B. I. Rubinstein, and J. Tygar, "Adversarial machine learning," in AISec 2011.

[24] N. Rndic and P. Laskov, "Practical evasion of a learning-based classifier: A case study," in SP 2014.

[25] D. M. Freeman, S. Jain, M. Durmuth, B. Biggio, and G. Giacinto, "Who are you? a statistical approach to measuring user authenticity."

[26] M. Großhans, C. Sawade, M. Bruckner, and T. Scheffer, "Bayesian games for adversarial regression problems," in ICML 2013.

[27] G. Yan, R. Lee, A. Kent, and D. Wolpert, "Towards a bayesian network game framework for evaluating ddos attacks and defense," in CCS 2012.

[28] P. Laskov and M. Kloft, "A framework for quantitative security analysis of machine learning," in AISec 2009.

[29] M. Bruckner and T. Scheffer, "Stackelberg games for adversarial prediction problems," in KDD 2011.

[30] B. Biggio, I. Pillai, S. Rota Bulo, D. Ariu, M. Pelillo, and F. Roli, "Is data clustering in adversarial settings secure?" in AISec 2013.

[31] B. Biggio, K. Rieck, D. Ariu, C. Wressnegger, I. Corona, G. Giacinto, and F. Roli, "Poisoning behavioral malware clustering," in AISec 2014.

[32] KDD. (2008) 2008 2nd pinkdd report. [Online]. Available: http://dl.acm.org/citation.cfm?id=1540296&dl=ACM&coll=DL&CFID=642625736&CFTOKEN=16618422

[33] ——. (2010) 2010 kdd intelligence and security informatics workshop. [Online]. Available: http://dl.acm.org/citation.cfm?id=1938606


[34] ——. (2012) 2012 kdd intelligence and security informatics workshop. [Online]. Available: http://cci.drexel.edu/isi/isi-kdd2012/

[35] C. Landwehr, D. Boneh, J. C. Mitchell, S. M. Bellovin, S. Landau, and M. E. Lesk, "Privacy and cybersecurity: The next 100 years," Proceedings of the IEEE, vol. 100, no. Special Centennial Issue, 2012.

[36] A. D. Joseph, P. Laskov, F. Roli, J. Tygar, and B. Nelson, "Dagstuhl Manifestos, Volume 3, Issue 1, January to December 2013, Complete Issue," Dagstuhl Manifestos, 2014.

[37] KDD. (2014) 2014 kdd security and privacy session. [Online]. Available: http://dl.acm.org/citation.cfm?id=2623330

[38] ICML. (2014) Icml 2014 workshop on learning, security and privacy. [Online]. Available: https://sites.google.com/site/learnsecprivacy2014/

[39] AAAI. Aaai 2016 workshop aisc. [Online]. Available: http://www-bcf.usc.edu/~aruneshs/AICS2016/index.html

[40] X. Li and Y. Xue, "A survey on server-side approaches to securing web applications," ACM Comput. Surv., vol. 46, no. 4, April 2014.

[41] P. Aggarwal and M. Chaturvedi, "Application of data mining techniques for information security in a cloud: A survey," International Journal of Computer Applications, vol. 80, no. 13, 2013.

[42] D. Bhattacharyya and J. Kalita, Network Anomaly Detection: A Machine Learning Perspective. CRC Press, 2013.

[43] D. Barbara and S. Jajodia, Applications of Data Mining in Computer Security, ser. Advances in Information Security, 2012.

[44] T. Ahmed, B. Oreshkin, and M. Coates, "Machine learning approaches to network anomaly detection," in Proceedings of the 2nd USENIX workshop on Tackling computer systems problems with machine learning techniques, 2007.

[45] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: A survey," ACM Comput. Surv., vol. 41, no. 3, 2009.

[46] R. Sommer and V. Paxson, "Outside the closed world: On using machine learning for network intrusion detection," in SP 2010.

[47] G. Gu, J. Zhang, and W. Lee, "Botsniffer: Detecting botnet command and control channels in network traffic," in NDSS 2008.

[48] G. Jacob, R. Hund, C. Kruegel, and T. Holz, "Jackstraws: Picking command and control connections from bot traffic," in USENIX Security 2011.

[49] S. Nagaraja, P. Mittal, C.-Y. Hong, M. Caesar, and N. Borisov, "Botgrep: Finding p2p bots with structured graph analysis," in USENIX Security 2010.

[50] J. Zhang, Y. Xie, F. Yu, D. Soukal, and W. Lee, "Intention and origination: An inside look at large-scale bot queries," in NDSS 2013.

[51] G. Jacob, E. Kirda, C. Kruegel, and G. Vigna, "Pubcrawl: Protecting users and businesses from crawlers," in USENIX Security 2012.

[52] G. Gu, R. Perdisci, J. Zhang, W. Lee et al., "Botminer: Clustering analysis of network traffic for protocol- and structure-independent botnet detection," in USENIX Security 2008.

[53] S. Gianvecchio, M. Xie, Z. Wu, and H. Wang, "Measurement and classification of humans and bots in internet chat," in USENIX Security 2008.

[54] X. Hu, M. Knysz, and K. G. Shin, "Rb-seeker: Auto-detection of redirection botnets," in NDSS 2009.

[55] E. Lee, J. Woo, H. Kim, A. Mohaisen, and H. K. Kim, "You are a game bot!: uncovering game bots in mmorpgs via self-similarity in the wild," in NDSS 2016.

[56] F. Chen, S. Ranjan, and P.-N. Tan, "Detecting bots via incremental ls-svm learning with dynamic feature adaptation," in KDD 2011.

[57] T. Krueger, H. Gascon, N. Kramer, and K. Rieck, "Learning stateful models for network honeypots," in AISec 2012.

[58] M. Antonakakis, R. Perdisci, D. Dagon, W. Lee, and N. Feamster, "Building a dynamic reputation system for dns," in USENIX Security 2010.

[59] M. Antonakakis, R. Perdisci, W. Lee, N. Vasiloglou II, and D. Dagon, "Detecting malware domains at the upper dns hierarchy," in USENIX Security 2011.

[60] T. Qiu, L. Ji, D. Pei, J. Wang, J. J. Xu, and H. Ballani, "Locating prefix hijackers using lock," in USENIX Security 2009.

[61] L. Bilge, E. Kirda, C. Kruegel, and M. Balduzzi, "Exposure: Finding malicious domains using passive dns analysis," in NDSS 2011.

[62] Y. Song, A. D. Keromytis, and S. J. Stolfo, "Spectrogram: A mixture-of-markov-chains model for anomaly detection in web traffic," in NDSS 2009.

[63] S. Small, J. Mason, F. Monrose, N. Provos, and A. Stubblefield, "To catch a predator: A natural language approach for eliciting malicious payloads," in USENIX Security 2008.

[64] S. Venkataraman, D. Brumley, S. Sen, and O. Spatscheck, "Automatically inferring the evolution of malicious activity on the internet."

[65] O. Fatemieh, A. Farhadi, R. Chandra, and C. A. Gunter, "Using classification to protect the integrity of spectrum measurements in white space networks," in NDSS 2011.

[66] T. Vissers, W. Joosen, and N. Nikiforakis, "Parking sensors: Analyzing and detecting parked domains," in NDSS 2015.

[67] P. Golle, "Machine learning attacks against the asirra captcha," in CCS 2008.

[68] E. Bursztein, M. Martin, and J. Mitchell, "Text-based captcha strengths and weaknesses," in CCS 2011.

[69] H. Gao, W. Wang, J. Qi, X. Wang, X. Liu, and J. Yan, "The robustness of hollow captchas," in CCS 2013.

[70] H. Gao, J. Yan, F. Cao, Z. Zhang, L. Lei, M. Tang, P. Zhang, X. Zhou, X. Wang, and J. Li, "A simple generic attack on text captchas," in NDSS 2016.

[71] M. Weir, S. Aggarwal, B. De Medeiros, and B. Glodek, "Password cracking using probabilistic context-free grammars," in SP 2009.


[72] R. Chatterjee, J. Bonneau, A. Juels, and T. Ristenpart, "Cracking-resistant password vaults using natural language encoders," in SP 2015.

[73] J. Ma, W. Yang, M. Luo, and N. Li, "A study of probabilistic password models," in SP 2014.

[74] C. Castelluccia, M. Durmuth, and D. Perito, "Adaptive password-strength meters from markov models," in NDSS 2012.

[75] P. G. Kelley, S. Komanduri, M. L. Mazurek, R. Shay, T. Vidas, L. Bauer, N. Christin, L. F. Cranor, and J. Lopez, "Guess again (and again and again): Measuring password strength by simulating password-cracking algorithms," in SP 2012.

[76] S. Zander and S. J. Murdoch, "An improved clock-skew measurement technique for revealing hidden services," in USENIX Security 2008.

[77] S. Afroz, A. C. Islam, A. Stolerman, R. Greenstadt, and D. McCoy, "Doppelganger finder: Taking stylometry to the underground," in SP 2014.

[78] A. Narayanan, H. Paskov, N. Z. Gong, J. Bethencourt, E. Stefanov, E. C. R. Shin, and D. Song, "On the feasibility of internet-scale author identification," in SP 2012.

[79] K. P. Dyer, S. E. Coull, T. Ristenpart, and T. Shrimpton, "Peek-a-boo, i still see you: Why efficient traffic analysis countermeasures fail," in SP 2012.

[80] N. Zheng, A. Paloski, and H. Wang, "An efficient user verification system via mouse movements," in CCS 2011.

[81] M. Frank, D. Basin, and J. M. Buhmann, "A class of probabilistic models for role engineering," in CCS 2008.

[82] E. Bursztein, R. Beauxis, H. Paskov, D. Perito, C. Fabry, and J. Mitchell, "The failure of noise-based non-continuous audio captchas," in SP 2011.

[83] C. Curtsinger, B. Livshits, B. Zorn, and C. Seifert, "Zozzle: Fast and precise in-browser javascript malware detection," in USENIX Security 2011.

[84] A. Kapravelos, Y. Shoshitaishvili, M. Cova, C. Kruegel, and G. Vigna, "Revolver: An automated approach to the detection of evasive web-based malware," in USENIX Security 2013.

[85] D. Sculley, M. E. Otey, M. Pohl, B. Spitznagel, J. Hainsworth, and Y. Zhou, "Detecting adversarial advertisements in the wild," in KDD 2011.

[86] S. Bhagavatula, C. Dunn, C. Kanich, M. Gupta, and B. Ziebart, "Leveraging machine learning to improve unwanted resource filtering," in AISec 2014.

[87] J. Zhang, P. A. Porras, and J. Ullrich, "Highly predictive blacklisting," in USENIX Security 2008.

[88] L. Lu, R. Perdisci, and W. Lee, "Surf: detecting and measuring search poisoning," in CCS 2011.

[89] K. Soska and N. Christin, "Automatically detecting vulnerable websites before they turn malicious," in USENIX Security 2014.

[90] K. Borgolte, C. Kruegel, and G. Vigna, "Meerkat: Detecting website defacements through image-based object recognition," in USENIX Security 2015.

[91] W. K. Robertson, F. Maggi, C. Kruegel, and G. Vigna, "Effective anomaly detection with scarce training data," in NDSS 2010.

[92] P. Chapman and D. Evans, "Automated black-box detection of side-channel vulnerabilities in web applications," in CCS 2011.

[93] C. Landwehr, "Cyber security and artificial intelligence: From fixing the plumbing to smart water," in AISec 2008.

[94] D. W. Richardson, S. D. Gribble, and T. Kohno, "The limits of automatic os fingerprint generation," in AISec 2010.

[95] S. Rasthofer, S. Arzt, and E. Bodden, "A machine-learning approach for classifying and categorizing android sources and sinks," in NDSS 2014.

[96] R. Pandita, X. Xiao, W. Yang, W. Enck, and T. Xie, "Whyper: Towards automating risk assessment of mobile applications," in USENIX Security 2013.

[97] R. Wang, W. Enck, D. Reeves, X. Zhang, P. Ning, D. Xu, W. Zhou, and A. M. Azab, "Easeandroid: Automatic policy analysis and refinement for security enhanced android via large-scale semi-supervised learning," in USENIX Security 2015.

[98] A. A. Makanju, A. N. Zincir-Heywood, and E. E. Milios, "Clustering event logs using iterative partitioning," in KDD 2009.

[99] M. Bozorgi, L. K. Saul, S. Savage, and G. M. Voelker, "Beyond heuristics: learning to classify vulnerabilities and predict exploits," in KDD 2010.

[100] H. Peng, C. Gates, B. Sarma, N. Li, Y. Qi, R. Potharaju, C. Nita-Rotaru, and I. Molloy, "Using probabilistic generative models for ranking risks of android apps," in CCS 2012.

[101] F. Yamaguchi, A. Maier, H. Gascon, and K. Rieck, "Automatic inference of search patterns for taint-style vulnerabilities," in SP 2015.

[102] Y. Liu, A. Sarabi, J. Zhang, P. Naghizadeh, M. Karir, M. Bailey, and M. Liu, "Cloudy with a chance of breach: Forecasting cyber security incidents," in USENIX Security 2015.

[103] J. L. Berral, N. Poggi, J. Alonso, R. Gavalda, J. Torres, and M. Parashar, "Adaptive distributed mechanism against flooding network attacks based on machine learning," in AISec 2008.

[104] S. Venkataraman, A. Blum, and D. Song, "Limits of learning-based signature generation with adversaries," in NDSS 2008.

[105] N. Srndic and P. Laskov, "Detection of malicious pdf files based on hierarchical document structure," in NDSS 2013.

[106] K. Lu, Z. Li, V. P. Kemerlis, Z. Wu, L. Lu, C. Zheng, Z. Qian, W. Lee, and G. Jiang, "Checking more and alerting less: Detecting privacy leakages via enhanced data-flow analysis and peer voting," in NDSS 2015.

[107] W. Xu, Y. Qi, and D. Evans, "Automatically evading classifiers," in NDSS 2016.

[108] A. Tamersoy, K. Roundy, and D. H. Chau, "Guilt by association: large scale malware detection by mining file-relation graphs," in KDD 2014.


[109] L. Invernizzi, S. Miskovic, R. Torres, C. Kruegel, S. Saha, G. Vigna, S.-J. Lee, and M. Mellia, "Nazca: Detecting malware distribution in large-scale networks," in NDSS 2014.

[110] D. Arp, M. Spreitzenbarth, M. Hubner, H. Gascon, and K. Rieck, "Drebin: Effective and explainable detection of android malware in your pocket," in NDSS 2014.

[111] M. Graziano, D. Canali, L. Bilge, A. Lanzi, and D. Balzarotti, "Needles in a haystack: Mining information from public dynamic analysis sandboxes for malware intelligence," in USENIX Security 2015.

[112] E. C. R. Shin, D. Song, and R. Moazzezi, "Recognizing functions in binaries with neural networks," in USENIX Security 2015.

[113] C. Smutz and A. Stavrou, "When a tree falls: Using diversity in ensemble classifiers to identify evasion in malware detectors," in NDSS 2016.

[114] M. Antonakakis, R. Perdisci, Y. Nadji, N. Vasiloglou, S. Abu-Nimeh, W. Lee, and D. Dagon, "From throw-away traffic to bots: Detecting the rise of dga-based malware," in USENIX Security 2012.

[115] M. S. Rahman, T.-K. Huang, H. V. Madhyastha, and M. Faloutsos, "Efficient and scalable socware detection in online social networks," in USENIX Security 2012.

[116] A. Lanzi, D. Balzarotti, C. Kruegel, M. Christodorescu, and E. Kirda, "Accessminer: Using system-centric models for malware protection," in CCS 2010.

[117] Y. Ye, T. Li, S. Zhu, W. Zhuang, E. Tas, U. Gupta, and M. Abdulhayoglu, "Combining file content and file relations for cloud based malware detection," in KDD 2011.

[118] C. Kolbitsch, P. M. Comparetti, C. Kruegel, E. Kirda, X. Zhou, and X. Wang, "Effective and efficient malware detection at the end host," in USENIX Security 2009.

[119] F. Ahmed, H. Hameed, M. Z. Shafiq, and M. Farooq, "Using spatio-temporal information in api calls with machine learning algorithms for malware detection," in AISec 2009.

[120] D. Kong and G. Yan, "Discriminant malware distance learning on structural information for automated malware classification," in KDD 2013.

[121] K. Borgolte, C. Kruegel, and G. Vigna, "Delta: automatic identification of unknown web-based infection campaigns," in CCS 2013.

[122] J. Jang, D. Brumley, and S. Venkataraman, "Bitshred: Feature hashing malware for scalable triage and semantic analysis," in CCS 2011.

[123] Y. Ye, T. Li, Y. Chen, and Q. Jiang, "Automatic malware categorization using cluster ensemble," in KDD 2010.

[124] U. Bayer, P. M. Comparetti, C. Hlauschek, C. Kruegel, and E. Kirda, "Scalable, behavior-based malware clustering," in NDSS 2009.

[125] M. Fredrikson, S. Jha, M. Christodorescu, R. Sailer, and X. Yan, in SP 2010.

[126] P. M. Comparetti, G. Wondracek, C. Kruegel, and E. Kirda, "Prospex: Protocol specification extraction," in SP 2009.

[127] D. Kirat and G. Vigna, "Malgene: Automatic extraction of malware analysis evasion signature," in CCS 2015.

[128] J. Ma, L. K. Saul, S. Savage, and G. M. Voelker, "Identifying suspicious urls: an application of large-scale online learning," in ICML 2009.

[129] ——, "Beyond blacklists: learning to detect malicious web sites from suspicious urls," in KDD 2009.

[130] A. Blum, B. Wardman, T. Solorio, and G. Warner, "Lexical feature based phishing url detection using online learning," in AISec 2010.

[131] C. Whittaker, B. Ryner, and M. Nazif, "Large-scale automatic classification of phishing pages," in NDSS 2010.

[132] S. Lee and J. Kim, "Warningbird: Detecting suspicious urls in twitter stream," in NDSS 2012.

[133] P. Zhao and S. C. Hoi, "Cost-sensitive online active learning with application to malicious url detection," in KDD 2013.

[134] K. Chatterjee, L. de Alfaro, and I. Pye, "Robust content-driven reputation," in AISec 2008.

[135] S. Hao, N. A. Syed, N. Feamster, A. G. Gray, and S. Krasser, "Detecting spammers with snare: Spatio-temporal network-level automatic reputation engine," in USENIX Security 2009.

[136] K. Thomas, C. Grier, J. Ma, V. Paxson, and D. Song, "Design and evaluation of a real-time url spam filtering service," in SP 2011.

[137] S. Afroz, M. Brennan, and R. Greenstadt, "Detecting hoaxes, frauds, and deception in writing style online," in SP 2012.

[138] L. Invernizzi, P. M. Comparetti, S. Benvenuti, C. Kruegel, M. Cova, and G. Vigna, "Evilseed: A guided approach to finding malicious web pages," in SP 2012.

[139] N. Jiang, Y. Jin, A. Skudlark, and Z.-L. Zhang, "Greystar: Fast and accurate detection of sms spam numbers in large cellular networks using gray phone space," in USENIX Security 2013.

[140] Q. Zhang, D. Y. Wang, and G. M. Voelker, "Dspin: Detecting automatically spun content on the web," in NDSS 2014.

[141] S. Whalen, N. Boggs, and S. J. Stolfo, "Model aggregation for distributed content anomaly detection," in AISec 2014.

[142] M. Egele, G. Stringhini, C. Kruegel, and G. Vigna, "Compa: Detecting compromised accounts on social networks," in NDSS 2013.

[143] B. Viswanath, M. A. Bashir, M. Crovella, S. Guha, K. P. Gummadi, B. Krishnamurthy, and A. Mislove, "Towards detecting anomalous user behavior in online social networks," in USENIX Security 2014.

[144] Y. Boshmaf, D. Logothetis, G. Siganos, J. Lería, J. Lorenzo, M. Ripeanu, and K. Beznosov, "Integro: Leveraging victim prediction for robust fake account detection in osns," in NDSS 2015.

[145] G. Stringhini, P. Mourlanne, G. Jacob, M. Egele, C. Kruegel, and G. Vigna, "Evilcohort: Detecting communities of malicious accounts on online services," in USENIX Security 2015.


[146] G. Wang, T. Wang, H. Zheng, and B. Y. Zhao, "Man vs. machine: Practical adversarial detection of malicious crowdsourcing workers," in USENIX Security 2014.

[147] C. T. Symons and J. M. Beaver, "Nonparametric semi-supervised learning for network intrusion detection: combining performance improvements with realistic in-situ training," in AISec 2012.

[148] C. Wressnegger, G. Schwenk, D. Arp, and K. Rieck, "A close look on n-grams in intrusion detection: anomaly detection vs. classification," in AISec 2013.

[149] N. Gornitz, M. Kloft, K. Rieck, and U. Brefeld, "Active learning for network intrusion detection," in AISec 2009.

[150] B. Juba, C. Musco, F. Long, S. Sidiroglou-Douskos, and M. C. Rinard, "Principled sampling for anomaly detection," in NDSS 2015.

[151] Q. Ding, N. Katenka, P. Barford, E. Kolaczyk, and M. Crovella, "Intrusion as (anti) social communication: characterization and detection," in KDD 2012.

[152] S. Xie, G. Wang, S. Lin, and P. S. Yu, "Review spam detection via temporal pattern discovery," in KDD 2012.

[153] M. Momtazpour, J. Zhang, S. Rahman, R. Sharma, and N. Ramakrishnan, "Analyzing invariants in cyber-physical systems using latent factor regression," in KDD 2015.

[154] M. Kloft, U. Brefeld, P. Duessel, C. Gehl, and P. Laskov, "Automatic feature selection for anomaly detection," in AISec 2008.

[155] T. Holz, C. Gorecki, K. Rieck, and F. C. Freiling, "Measuring and detecting fast-flux service networks," in NDSS 2008.

[156] B. I. Rubinstein, B. Nelson, L. Huang, A. D. Joseph, S.-h. Lau, S. Rao, N. Taft, and J. Tygar, "Antidote: understanding and defending against poisoning of anomaly detectors," in IMC 2009.

[157] R. Moskovitch, N. Nissim, and Y. Elovici, "Acquisition of malicious code using active learning," in PinKDD 2008.

[158] O. Tripp and J. Rubin, "A bayesian approach to privacy enforcement in smartphones," in USENIX Security 2014.

[159] Q. Wang, Z. Lin, N. Borisov, and N. Hopper, "rbridge: User reputation based tor bridge distribution with privacy preservation," in NDSS 2013.

[160] Y. Cao and J. Yang, "Towards making systems forget with machine unlearning," in SP 2015.

[161] Z. Huang, E. Ayday, J. Fellay, J.-P. Hubaux, and A. Juels, "Genoguard: Protecting genomic data against brute-force attacks," in SP 2015.

[162] R. Kumar, "Mining web logs: applications and challenges," in KDD 2009.

[163] R. Shokri, G. Theodorakopoulos, J.-Y. Le Boudec, and J.-P. Hubaux, "Quantifying location privacy," in SP 2011.

[164] B. A. Prakash, N. Valler, D. Andersen, M. Faloutsos, and C. Faloutsos, "Bgp-lens: Patterns and anomalies in internet routing updates," in KDD 2009.
