Adversarial Machine Learning: Big Data Meets Cyber Security
Bowei Xi
Department of Statistics
Purdue University
With Murat Kantarcioglu
Introduction
Many adversarial learning problems arise in practice.
— Image Classification/Object Recognition
— Intrusion Detection
— Fraud Detection
— Spam Detection
— Social Networks
Adversary adapts to avoid being detected.
New solutions are needed to address this challenge.
A million ways to write a spam email
From: "Ezra Martens" <ezrabngktbbem...
To: "Eleftheria Marconi" <clifton@pu...
Subject: shunless Phaxrrmaceutical
Date: Fri, 30 Sep 2005 04:49:10 -0500
Hello,
Easy Fast =
Best Home Total
OrdeShipPrricDelivConf
ringpingeseryidentiality
VIAAmbCIALevVALXan
GRAienLISitraIUMax
$ $ $$
3.33 1.21 3.75
Get =additional informmation attempted to
(The message uses an HTML table to slice words across columns: read column-wise, "OrdeShipPrricDelivConf" over "ringpingeseryidentiality" spells Ordering, Shipping, Prrices, Delivery, Confidentiality, and "VIAAmbCIALevVALXan" over "GRAienLISitraIUMax" spells VIAGRA, Ambien, CIALIS, Levitra, VALIUM, Xanax, defeating keyword filters.)
Understanding Adversarial Learning
It is not concept drift.
It is not online learning.
Adversary changes the objects under its control to avoid being detected.
This is a game between the defender and the adversary.
Solution Ideas
Constantly update a learning algorithm
— Dalvi et al., Adversarial classification, KDD 2004
Optimize worst-case performance
— Globerson and Roweis, Nightmare at test time: robust learning
by feature selection, ICML 2006
Goals
— How to evaluate a defensive learning algorithm?
— How to construct a resilient defensive learning algorithm?
Game Theoretic Framework
Often a classifier is updated after observing the adversary's actions; spam filters are a typical example.
Adversarial Stackelberg Game
Two players take sequential actions.
1. Adversary moves first by choosing a transformation $T$: $f_b(x) \longrightarrow f_b^T(x)$.
2. After observing $T$, defender sets parameter values for a learning algorithm and creates a defensive rule $h(x)$.
Overall the population is a mixture distribution with two classes: $f(x) = p_g f_g(x) + p_b f_b^T(x)$, where $p_g + p_b = 1$.
Game Theoretic Framework
— Adversary’s payoff is defined as
$$u_b(T, h) = \int_{L(h,\,g)} g(T, x)\, f_b^T(x)\, dx.$$
$g(T, x)$ is the profit for a “bad” object $x$ being classified as a “good” one after transformation $T$ is applied; there is a penalty for transformation. $L(h, g)$ is the region where objects are identified as legitimate under decision rule $h(x)$.
— Let $C(T, h)$ be the misclassification cost. Defender’s payoff is $u_g(T, h) = -C(T, h)$.
— $h_T(x)$ is the defender’s best response facing $T$.
Game Theoretic Framework
— The adversary’s gain from applying transformation $T$ is $W(T) = u_b(T, h_T)$:
$$W(T) = \int_{L(h_T,\,g)} g(T, x)\, f_b^T(x)\, dx = E_{f_b^T}\!\left( I_{\{L(h_T,\,g)\}}(x)\; g(T, x) \right).$$
— $(T^e, h_{T^e})$ is a subgame perfect equilibrium, such that
$$T^e = \arg\max_{T \in S} W(T).$$
— After an equilibrium is reached, neither player has an incentive to change its action.
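A minimal sketch of this equilibrium search in one dimension, assuming Gaussian class-conditional densities, a linear transformation penalty, and a midpoint best-response threshold; all parameter values and function names are illustrative, not from the slides:

import numpy as np
from scipy.stats import norm

# Illustrative 1-D Stackelberg search (made-up parameters).
# The adversary shifts its mean toward the good class and pays a linear
# penalty a per unit of movement; the defender moves second with the
# Bayes midpoint threshold for two equal-variance, equal-prior Gaussians.
mu_g, mu_b, sigma = 1.0, 4.0, 1.0

def best_response_threshold(mu_b_T):
    # Defender's best response h_T to a shifted bad mean.
    return (mu_g + mu_b_T) / 2.0

def adversary_gain(shift, a=0.5, profit=1.0):
    # W(T): expected profit of bad objects landing in the "good" region,
    # with the linear transformation penalty subtracted from the profit.
    mu_b_T = mu_b - shift
    thr = best_response_threshold(mu_b_T)
    p_undetected = norm.cdf(thr, loc=mu_b_T, scale=sigma)
    return p_undetected * (profit - a * shift)

shifts = np.linspace(0.0, mu_b - mu_g, 301)   # candidate transformations T
W = np.array([adversary_gain(s) for s in shifts])
s_e = shifts[int(np.argmax(W))]
print(f"equilibrium shift T^e = {s_e:.2f}, W(T^e) = {W.max():.3f}")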
Game Theoretic Framework
Attribute selection is important in adversarial learning.
Example: Gaussian mixture distribution, Naive Bayes classifier, and
linear penalty for transformation.
Attribute   π_g       π_b        Penalty    Equi. Bayes Error
X1          N(1,1)    N(3,1)     a = 1      0.16
X2          N(1,1)    N(3.5,1)   a = 0.45   0.13
X3          N(1,1)    N(4,1)     a = 0      0.23
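As a quick check (an addition, not from the slides): the pre-transformation Bayes error of two equal-variance, equal-prior Gaussians is $\Phi(-|\mu_b - \mu_g|/(2\sigma))$. The equilibrium errors for X2 and X3 in the table differ from these baselines because they already reflect the adversary's transformation.

from scipy.stats import norm

# Bayes error for an equal-prior mixture of N(mu_g, s^2) and N(mu_b, s^2):
# the optimal threshold is the midpoint, giving error Phi(-|mu_b - mu_g| / (2s)).
def bayes_error(mu_g, mu_b, s=1.0):
    return norm.cdf(-abs(mu_b - mu_g) / (2.0 * s))

print(bayes_error(1, 3))    # X1: ~0.159, matching the 0.16 in the table
print(bayes_error(1, 3.5))  # X2 baseline: ~0.106 (0.13 at equilibrium)
print(bayes_error(1, 4))    # X3 baseline: ~0.067 (0.23 at equilibrium)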
Game Theoretic Framework
— Discretize the continuous numerical variables.
— Use the product of one-dimensional marginal distributions to approximate the joint distribution.
— Search for an equilibrium using linear programming.
Case study: Lending Club Data
Peer-to-peer financing, offering small loans ranging from $1,000 to $25,000.
Except for the credit report, many fields are easy to lie about, such as job information, purpose of the loan, etc.
Game Theoretic Framework
We use the following attributes from the data set.
X1 – Amount requested
X2 – Loan purpose
X3 – Debt-to-income ratio
X4 – Home ownership (any, none, rent, own, mortgage)
X5 – Monthly income
X6 – FICO range
X7 – Open credit lines
X8 – Total credit lines
X9 – Revolving credit balance
X10 – Revolving line utilization
X11 – Inquiries in the last 6 months
X12 – Accounts now delinquent
X13 – Delinquent amount
X14 – Delinquencies in last 2 years
X15 – Months since last delinquency
Game Theoretic Framework
— The instances originally are classified as:
1) “Removed”; 2) “Loan is being issued”; 3) “Late (31-120 days)”; 4) “Late (16-30 days)”; 5) “Issued”; 6) “In review”; 7) “In grace period”; 8) “In funding”; 9) “Fully paid”; 10) “Expired”; 11) “Default”; 12) “Current”; 13) “Charged Off”; 14) “New”.
— 5850 instances after data cleaning.
— $p_g = p_b = 0.5$.
— Numerical attributes are discretized using 10 equi-width buckets.
— Naive Bayes classifier with independent attributes.
— Attacker transforms each attribute independently.
Game Theoretic Framework
$$\max\;\; -\sum_{i=1}^{m}\sum_{j=1}^{m} f_{ij}\, d_{ij} \;+\; \sum_{i=1}^{m} I_{\{y_i < q_i\}}\; y_i\, u_i$$
subject to
$$f_{ij} \ge 0, \quad i = 1, \dots, m, \;\; j = 1, \dots, m \qquad (1)$$
$$\sum_{j=1}^{m} f_{ij} = p_i, \quad i = 1, \dots, m \qquad (2)$$
$$\sum_{i=1}^{m} f_{ij} = y_j, \quad j = 1, \dots, m \qquad (3)$$
$$\sum_{j=1}^{m} y_j = 1 \qquad (4)$$
$p_i$ is the probability of $x = w_i$ given that the instance is in the bad class; $q_i$ is the probability of $x = w_i$ given that the instance is in the good class.
$f_{ij}$ is the proportion moved from $w_i$ to $w_j$ by the adversary.
$d_{ij}$ is the penalty of transforming $X$ from $w_i$ to $w_j$.
$u_i$ is the expected profit.
$y_i$ is the probability after transformation.
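A sketch of this LP with scipy, under one simplifying assumption: the indicator $I_{\{y_i < q_i\}}$ makes the exact objective piecewise, so the sketch fixes the set of cells the classifier currently labels good and treats it as constant. All data values and names below are hypothetical.

import numpy as np
from scipy.optimize import linprog

# Sketch of the adversary's transformation LP with the good-labeled
# cells held fixed; all data values are made up.
m = 4
p = np.array([0.4, 0.3, 0.2, 0.1])    # bad-class prob. of each cell w_i
u = np.ones(m)                        # expected profit per cell
d = 0.2 * np.abs(np.subtract.outer(np.arange(m), np.arange(m)))  # penalty d_ij
good = np.array([True, True, False, False])  # cells currently labeled good

# Variables: f_ij flattened row-major, with y_j = sum_i f_ij substituted.
# Maximize -sum f_ij d_ij + sum_{j good} u_j y_j
# <=> minimize sum_ij f_ij (d_ij - u_j * 1{j good}).
c = (d - np.where(good, u, 0.0)[None, :]).ravel()

A_eq = np.zeros((m, m * m))           # row sums: sum_j f_ij = p_i
for i in range(m):
    A_eq[i, i * m:(i + 1) * m] = 1.0

res = linprog(c, A_eq=A_eq, b_eq=p, bounds=(0, None))
f = res.x.reshape(m, m)
print("post-transformation distribution y:", f.sum(axis=0).round(3))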
Game Theoretic Framework
— The worst case scenario has transformation cost 0.
— For the penalized case, we assume improving an attribute by one level costs $500 (results do not change in the $300-$600 range).
— Attribute selection (choose the top 6 attributes according to one-dimensional classification accuracy after transformation) works well.
— Equilibrium performance is a good indicator of the long term
success of a learning algorithm.
— Penalty prevents aggressive attacks.

Experiment Type                             Accuracy
Initial data set prior to transformation    89.95%
Worst case scenario                         61.25%
Penalized transformation scenario           72.28%
Attribute selection scenario                70.92%
Leader vs. Follower
How does the defender being a leader versus being a follower affect its equilibrium strategies?
One defender with utility D(h).
$m$ adversaries, each with utility $A_i(t_i)$.
$h$ and $t_i$ are the players' strategies.
Leader vs. Follower
Defender being the leader
A one-leader-m-follower game.
1. Fix a leader’s strategy $h$, and let the $m$ adversaries’ (i.e., the followers’) strategies be the attacks $T = (t_1, \cdots, t_m)$. For the $i$-th adversary, further assume all other adversaries’ strategies are fixed, i.e., fixed $t_j$ for all $j \neq i$. Solve the following optimization for $t_i^h$:
$$t_i^h = \arg\max_{t_i \in S_i} \{A_i(t_i, h)\}$$
2. With the solution from above, $T^h = (t_1^h, \cdots, t_m^h)$ is the $m$ adversaries’ joint optimal attack for a given defender strategy $h$. The defender then solves another optimization problem:
$$h^e = \arg\max_{h \in H} \{D(t_1^h, \cdots, t_m^h, h)\}$$
$(h^e, t_1^{h^e}, \cdots, t_m^{h^e})$ is an equilibrium strategy for all players in the game.
Leader vs. Follower
Defender being the follower
An m-leader-one-follower game.
1. Given the joint attacks from the $m$ adversaries, $T = (t_1, \cdots, t_m)$, solve for the defender’s optimal strategy:
$$h_T = \arg\max_{h \in H} \{D(t_1, \cdots, t_m, h)\}$$
2. With the solution above as the defender’s optimal strategy $h_T$ against joint attacks $T = (t_1, \cdots, t_m)$, solve for the optimal joint attacks $T^e$:
$$T^e = (t_1^e, \cdots, t_m^e) = \arg\max_{t_i \in S_i, \forall i} \sum_{i=1}^{m} A_i(t_i, h_T)$$
$(h_{T^e}, t_1^e, \cdots, t_m^e)$ is an equilibrium strategy for all players in the game.
Leader vs. Follower
Gaussian Mixture Population
The defender controls the normal population $N_p(\mu_g, \Sigma_g)$.
Each adversary controls a population $N_p(\mu_{b_i}, \Sigma_{b_i})$.
For an attack $t$, adversarial objects move toward $\mu_g$, the center of the normal population: $X_t = \mu_g + (1-t)(X - \mu_g)$, where $0 \le t \le 1$ and $t = 1$ is the strongest attack.
The adversary population after attack is $N_p\big((1-t)\mu_b + t\mu_g,\; (1-t)^2\Sigma_b\big)$.
The $m$ adversaries' strategy is the joint attack $(t_1, \cdots, t_m)$.
Defender’s strategy is to build a defensive wall around the center µg.
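A small numeric illustration of the attack transform, with made-up parameters:

import numpy as np

# Attack transform X_t = mu_g + (1 - t) * (X - mu_g); parameters are made up.
rng = np.random.default_rng(0)
mu_g = np.array([0.0, 0.0])
mu_b = np.array([3.0, 3.0])
X_bad = rng.multivariate_normal(mu_b, np.eye(2), size=500)

def attack(X, t):
    return mu_g + (1.0 - t) * (X - mu_g)

for t in (0.0, 0.5, 1.0):
    Xt = attack(X_bad, t)
    # Mean moves to (1-t)*mu_b + t*mu_g; covariance shrinks to (1-t)^2 * Sigma_b.
    print(t, Xt.mean(axis=0).round(2), np.cov(Xt.T).diagonal().round(2))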
Leader vs. Follower
A Euclidean defensive wall is an ellipsoid-shaped defensive region
$$(X - \hat\mu_g)'\, S_g^{-1}\, (X - \hat\mu_g) = \chi^2_p(\alpha),$$
where $0 < \alpha < 1$ controls the size. It is an approximate confidence region for a multivariate normal.
A Manhattan defensive wall forms a diamond-shaped region
$$\sum_{i=1}^{p} \frac{|X_i - \hat\mu_{g_i}|}{\hat\sigma_{g_i}} = \eta,$$
where $\hat\sigma_{g_i}$ is the sample standard deviation, $\hat\mu_{g_i}$ is the sample mean, and $\eta$ controls the size.
Leader vs. Follower
Obtain $\eta(\alpha)$ as a function of $\alpha$: using the sample mean vector $\hat\mu_g$ and variance-covariance matrix $S_g$ from the normal objects, generate a large sample from $N(\hat\mu_g, S_g)$. For an $\eta(\alpha)$, $(100\alpha)\%$ of the generated sample points fall into the diamond-shaped Manhattan region with vertices at $\hat\mu_{g_i} \pm \eta(\alpha)\hat\sigma_{g_i}$ on each dimension.
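A Monte Carlo sketch of this calibration, with made-up inputs standing in for the sample estimates:

import numpy as np

# Calibrate eta(alpha): draw a large sample from N(mu_hat, S) and choose
# eta so that (100*alpha)% of it falls inside the Manhattan diamond.
rng = np.random.default_rng(1)

def eta_of_alpha(mu_hat, S, alpha, n=100_000):
    X = rng.multivariate_normal(mu_hat, S, size=n)
    sd = np.sqrt(np.diag(S))
    dist = (np.abs(X - mu_hat) / sd).sum(axis=1)  # standardized L1 distance
    return np.quantile(dist, alpha)               # (100*alpha)% inside

mu_hat = np.zeros(2)
S = np.array([[1.0, 0.3], [0.3, 1.0]])
print(eta_of_alpha(mu_hat, S, 0.90))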
A defender’s strategy is to choose $h = \alpha$ or $h = \eta(\alpha)$.
Defender utility is $D(\alpha)$ or $D(\eta(\alpha))$. Let $c$ be the misclassification cost:
$$D(\alpha) = -100\,(\text{error-of-normal} + c \times \text{error-of-adversary}).$$
An adversary's utility is $L(t) = E\left(\max\{k - a\,\|X_t - X\|_2,\, 0\}\right)$, where $k$ is the maximum utility of an adversarial object and $\|X_t - X\|_2$ measures how much an adversarial object is moved toward the normal population.
We do an exhaustive search to find the equilibria of the games, as sketched below.
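A sketch of the exhaustive search under simplifying assumptions: one adversary, a Euclidean wall with identity covariance, Monte Carlo error estimates, and the added convention that the adversary's gain is realized only when an attack object lands inside the wall. All parameter values are made up.

import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
p, c, k, a = 2, 2.0, 5.0, 1.0                  # dimension, cost, max utility, penalty
mu_g, mu_b = np.zeros(p), np.full(p, 3.0)
Xg = rng.multivariate_normal(mu_g, np.eye(p), 2000)   # normal sample
Xb = rng.multivariate_normal(mu_b, np.eye(p), 2000)   # adversarial sample

def inside(X, alpha):
    # Euclidean wall with identity covariance (simplification).
    return ((X - mu_g) ** 2).sum(axis=1) <= chi2.ppf(alpha, df=p)

def D(alpha, t):                               # defender utility
    Xt = mu_g + (1 - t) * (Xb - mu_g)
    err_g = 1 - inside(Xg, alpha).mean()       # normal objects blocked out
    err_b = inside(Xt, alpha).mean()           # adversarial objects let in
    return -100 * (err_g + c * err_b)

def A(alpha, t):                               # adversary utility L(t)
    Xt = mu_g + (1 - t) * (Xb - mu_g)
    gain = np.maximum(k - a * np.linalg.norm(Xt - Xb, axis=1), 0.0)
    return (gain * inside(Xt, alpha)).mean()   # gain only if inside the wall

alphas = np.linspace(0.5, 0.99, 25)
ts = np.linspace(0.0, 1.0, 21)

# Defender as leader: adversary best-responds to each alpha.
t_br = {al: ts[np.argmax([A(al, t) for t in ts])] for al in alphas}
al_lead = max(alphas, key=lambda al: D(al, t_br[al]))
print("leader eq:", round(al_lead, 2), round(t_br[al_lead], 2))

# Defender as follower: defender best-responds to each t.
al_br = {t: alphas[np.argmax([D(al, t) for al in alphas])] for t in ts}
t_fol = max(ts, key=lambda t: A(al_br[t], t))
print("follower eq:", round(al_br[t_fol], 2), round(t_fol, 2))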
Leader vs. Follower
(Figure) Euclidean defensive wall. Left: defender being the leader; right: defender being the follower.
Leader vs. Follower
(Figure) Manhattan defensive wall. Left: defender being the leader; right: defender being the follower.
Adversarial Clustering
Previous adversarial learning work assumes the availability of large
amounts of labeled instances.
We develop a novel grid based adversarial clustering algorithm using
mostly unlabeled objects with a few labeled instances.
A classifier created using very few labeled objects is inaccurate.
We identify 1) the centers of normal objects, 2) sub-clusters of attack
objects, 3) the overlapping areas, 4) outliers and unknown clusters.
We draw defensive walls around the centers of the normal objects to
block out attack objects. The size of a defensive wall is based on
the previous game theoretic study.
Adversarial Clustering
It is not semi-supervised learning.
We operate under very different assumptions.
In adversarial settings, objects similar to each other belong to different classes, while objects in different clusters belong to the same class.
Within a cluster, objects from two classes can overlap significantly.
Adversaries can bridge the gap between two previously well separated
clusters.
Semi-supervised learning assigns labels to all the unlabeled objects
aiming at the best accuracy.
We do not label the overlapping regions and outliers.
Adversarial Clustering
We have nearly purely normal objects inside the defensive walls, despite an increased Bayes error.
It is like airport security: a small number of passengers use the fast pre-check lane, while all other passengers go through the security check. The goal is not to let a single terrorist enter the airport, even at the cost of blocking out many normal objects.
A classification boundary, i.e., a defensive wall, is analogous to a point estimate, while the overlapping areas are analogous to confidence regions.
Adversarial Clustering
A Grid-Based Adversarial Clustering Algorithm
— Initialization: set parameter values.
— Merge 1: create labeled normal and abnormal sub-clusters.
— Merge 2: cluster the remaining unlabeled data points.
— Merge 3: merge all the data points without considering the labels.
— Match: match the big unlabeled clusters with the normal and abnormal clusters, and with the clusters of the remaining unlabeled data points from the first pass.
— Draw α-level defensive walls inside the normal regions.
A weight parameter k controls the size of the overlapping areas. A sketch of the grid idea is given below.
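This sketch covers only the grid binning and cell-merging core, not the full multi-pass algorithm above; the function name, bin count, and density threshold min_pts are illustrative choices.

import numpy as np
from collections import deque

# Grid clustering core: bin points into cells, keep dense cells, and
# merge adjacent dense cells into clusters via breadth-first search.
def grid_clusters(X, n_bins=20, min_pts=3):
    lo, hi = X.min(axis=0), X.max(axis=0)
    cells = np.floor((X - lo) / (hi - lo + 1e-12) * n_bins).astype(int)
    occupied = {}
    for idx, cell in enumerate(map(tuple, cells)):
        occupied.setdefault(cell, []).append(idx)
    dense = {c for c, pts in occupied.items() if len(pts) >= min_pts}

    labels = np.full(len(X), -1)          # -1 = outlier / unassigned
    cluster_id = 0
    seen = set()
    for start in dense:
        if start in seen:
            continue
        queue, seen_here = deque([start]), {start}
        while queue:                      # BFS over adjacent dense cells
            cell = queue.popleft()
            labels[occupied[cell]] = cluster_id
            for d in np.ndindex((3,) * X.shape[1]):
                nb = tuple(np.array(cell) + np.array(d) - 1)
                if nb in dense and nb not in seen_here:
                    seen_here.add(nb)
                    queue.append(nb)
        seen |= seen_here
        cluster_id += 1
    return labels

# Two well-separated blobs should yield two cluster ids (plus possibly -1).
X = np.vstack([np.random.default_rng(4).normal(0, 0.3, (150, 2)),
               np.random.default_rng(5).normal(2, 0.3, (150, 2))])
print(np.unique(grid_clusters(X)))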
Adversarial Clustering
We compare with semi-supervised learning methods: EM least squares and S4VM.
α = 0.6. True labels are shown below: blue for normal; orange for abnormal; purple for unlabeled; yellow and black for outliers.
(Figure) True labels for Experiment 1 (left) and Experiment 2 (right).
Adversarial Clustering
(Figure) Exp. 1. Left: ADClust with k = 10; right: ADClust with k = 20.
Adversarial Clustering
(Figure) Exp. 1. Left: EM least squares; right: S4VM.
Adversarial Clustering
(Figure) Exp. 2. Left: ADClust with k = 10; right: ADClust with k = 20.
Adversarial Clustering
(Figure) Exp. 2. Left: EM least squares; right: S4VM.
Adversarial Clustering
Experiment 1 has an attack taking place between two normal clusters.
Experiment 2 has an unknown cluster, with normal and abnormal
clusters heavily mixed under a strong attack.
Our algorithm found the core areas of the normal objects.
The semi-supervised methods wrongly labeled a whole normal cluster in Experiment 1, and labeled the unknown cluster as normal in Experiment 2.
Smaller k is more conservative, creating larger unlabeled mixed areas.
Adversarial Clustering
KDD Cup 1999 Network Intrusion Data
Around 40 percent are network intrusion instances.
We use 25,192 instances from the training data and the top 7 continuous features.
Experiment 1 has 100 runs. In each run, 150 randomly sampled instances keep their labels; the remaining 99.4% become unlabeled.
Experiment 2 has 100 runs. In each run, 100 randomly sampled instances keep their labels; the remaining 99.6% become unlabeled.
Larger k produces fewer unlabeled points and bigger normal regions containing more attack instances; it is more aggressive.
Adversarial Clustering
(Figure) Exp. 1: increasing the weight k from 1 to 100. Top left: percent of abnormal points in the mixed region; bottom left: percent of abnormal points among outliers; top right: number of points in the mixed region; bottom right: number of points as outliers.
Adversarial Clustering
(Figure) Exp. 2: k equal to 1, 30, and 50 as low, medium, and high weights; α levels set from 0.6 to 0.95. The KDD data is highly mixed, yet we achieve on average a nearly 90% pure normal rate inside the defensive walls.
Adversarial SVM
(Figure) Classification boundaries. + marks the untransformed “bad” objects; o marks the “good” objects; * marks the transformed “bad” objects, i.e., the attack objects. The black dashed line is the standard SVM classification boundary, and the blue line is the Adversarial SVM classification boundary. Both the untransformed and the transformed “bad” objects are what we want to detect and block.
Adversarial SVM
AD-SVM solves a convex optimization problem where the constraints
are tied to adversarial attack models.
Free-range attack: Adversary can move attack objects anywhere in the domain:
$$C_f(x_{\min,j} - x_{ij}) \le \delta_{ij} \le C_f(x_{\max,j} - x_{ij})$$
The $j$-th feature of an instance $x_i$ falls between $x_{\min,j}$ and $x_{\max,j}$.
$C_f \in [0,1]$. $C_f = 0$ means no attack; $C_f = 1$ is the most aggressive attack.
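A small helper (an illustration, not from the slides) computing these bounds for a data matrix:

import numpy as np

# Free-range bounds: C_f(x_min_j - x_ij) <= delta_ij <= C_f(x_max_j - x_ij).
def free_range_bounds(X, C_f):
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return C_f * (x_min - X), C_f * (x_max - X)

X = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
lower, upper = free_range_bounds(X, C_f=0.5)
print(lower[0], upper[0])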
Adversarial SVM
Targeted attack: Adversary can only move attack instances closer to a targeted value.
$x_i^t$ is the target. The adversary adds $\delta_{ij}$ to the $j$-th feature $x_{ij}$ of an attack object, with $|\delta_{ij}| \le |x_{ij}^t - x_{ij}|$.
An upper bound on the amount of movement for $x_{ij}$:
$$0 \le (x_{ij}^t - x_{ij})\,\delta_{ij} \le C_\xi \left(1 - \frac{C_\delta\,|x_{ij}^t - x_{ij}|}{|x_{ij}| + |x_{ij}^t|}\right)(x_{ij}^t - x_{ij})^2$$
$C_\delta$ reflects the loss of malicious utility; it can be larger than 1. $1 - \frac{C_\delta\,|x_{ij}^t - x_{ij}|}{|x_{ij}| + |x_{ij}^t|}$ is the maximum fraction of $x_{ij}^t - x_{ij}$ that $\delta_{ij}$ is allowed to be.
$C_\xi \in [0,1]$ is a discount factor. Larger $C_\xi$ allows more movement.
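The per-feature movement budget implied by this bound, as a small helper (again an illustration):

import numpy as np

# Largest |delta_ij| toward the target x^t implied by the bound above.
def targeted_upper_bound(x, x_t, C_xi, C_delta):
    gap = x_t - x
    frac = 1.0 - C_delta * np.abs(gap) / (np.abs(x) + np.abs(x_t) + 1e-12)
    frac = np.clip(frac, 0.0, None)   # no movement once the utility loss dominates
    return C_xi * frac * np.abs(gap)

print(targeted_upper_bound(np.array([1.0]), np.array([3.0]), C_xi=1.0, C_delta=0.5))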
Adversarial SVM
SVM risk minimization model: free-range attack
$$\underset{w,\,b,\,\xi_i,\,t_i,\,u_i,\,v_i}{\arg\min}\;\; \tfrac{1}{2}\|w\|^2 + C\sum_i \xi_i$$
subject to
$$\xi_i \ge 0$$
$$\xi_i \ge 1 - y_i\,(w \cdot x_i + b) + t_i$$
$$t_i \ge \sum_j C_f\left( v_{ij}\,(x_j^{\max} - x_{ij}) - u_{ij}\,(x_j^{\min} - x_{ij}) \right)$$
$$u_i - v_i = \tfrac{1}{2}(1 + y_i)\,w$$
$$u_i \succeq 0, \quad v_i \succeq 0$$

SVM risk minimization model: targeted attack
$$\underset{w,\,b,\,\xi_i,\,t_i,\,u_i,\,v_i}{\arg\min}\;\; \tfrac{1}{2}\|w\|^2 + C\sum_i \xi_i$$
subject to
$$\xi_i \ge 0$$
$$\xi_i \ge 1 - y_i\,(w \cdot x_i + b) + t_i$$
$$t_i \ge \sum_j e_{ij}\, u_{ij}$$
$$(-u_i + v_i) \circ (x_i^t - x_i) = \tfrac{1}{2}(1 + y_i)\,w$$
$$u_i \succeq 0, \quad v_i \succeq 0$$
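A sketch of the free-range model as a disciplined convex program in cvxpy, following the constraints above; the synthetic data and the C, C_f values are made up, and labels are in {-1, +1} so that the 0.5(1 + y_i) factor activates the attack terms for the +1 class only.

import cvxpy as cp
import numpy as np

# Synthetic two-class data; all values are illustrative.
rng = np.random.default_rng(3)
n, dim, C, C_f = 40, 2, 1.0, 0.3
X = np.vstack([rng.normal(0, 1, (n // 2, dim)), rng.normal(3, 1, (n // 2, dim))])
y = np.hstack([-np.ones(n // 2), np.ones(n // 2)])
x_min, x_max = X.min(axis=0), X.max(axis=0)

w, b = cp.Variable(dim), cp.Variable()
xi, t = cp.Variable(n), cp.Variable(n)
u, v = cp.Variable((n, dim)), cp.Variable((n, dim))

cons = [xi >= 0, u >= 0, v >= 0]
for i in range(n):
    cons += [
        xi[i] >= 1 - y[i] * (X[i] @ w + b) + t[i],
        t[i] >= cp.sum(cp.multiply(C_f * (x_max - X[i]), v[i])
                       - cp.multiply(C_f * (x_min - X[i]), u[i])),
        u[i] - v[i] == 0.5 * (1 + y[i]) * w,
    ]
prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi)), cons)
prob.solve()
print(w.value, b.value)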
Adversarial SVM
The spambase data is from the UCI data repository. Accuracy of AD-SVM, SVM, and one-class SVM on the spambase data as free-range attacks intensify; $C_f$ increases as attacks become more aggressive.
Attacks are generated as $\delta_{ij} = f_{attack}(x_{ij}^t - x_{ij})$.
                    f_attack = 0   f_attack = 0.3   f_attack = 0.5   f_attack = 0.7   f_attack = 1.0
AD-SVM, C_f = 0.1   0.882          0.852            0.817            0.757            0.593
AD-SVM, C_f = 0.3   0.880          0.864            0.833            0.772            0.588
AD-SVM, C_f = 0.5   0.870          0.860            0.836            0.804            0.591
AD-SVM, C_f = 0.7   0.859          0.847            0.841            0.814            0.592
AD-SVM, C_f = 0.9   0.824          0.829            0.815            0.802            0.598
SVM                 0.881          0.809            0.742            0.680            0.586
One-class SVM       0.695          0.686            0.667            0.653            0.572
Adversarial SVM
$C_\delta$ decreases as targeted attacks become more aggressive. $C_\xi = 1$.
                    f_attack = 0   f_attack = 0.3   f_attack = 0.5   f_attack = 0.7   f_attack = 1.0
AD-SVM, C_δ = 0.9   0.874          0.821            0.766            0.720            0.579
AD-SVM, C_δ = 0.7   0.888          0.860            0.821            0.776            0.581
AD-SVM, C_δ = 0.5   0.874          0.860            0.849            0.804            0.586
AD-SVM, C_δ = 0.3   0.867          0.855            0.845            0.809            0.590
AD-SVM, C_δ = 0.1   0.836          0.840            0.839            0.815            0.597
SVM                 0.884          0.812            0.761            0.686            0.591
One-class SVM       0.695          0.687            0.676            0.653            0.574
AD-SVM is more resilient to modest attacks than other SVM learning
algorithms.
Funding Support
— ARO W911NF-17-1-0356: Data Analytics for Cyber Security: Defeating the
Active Adversaries, UT Dallas PI: M. Kantarcioglu, Purdue PI: B. Xi, $470,000,
08/07/2017 – 08/06/2020, Amount Responsible: $222,500
— ARO W911NF-12-1-0558: A Game Theoretic Framework for Adversarial Classification, UT Dallas PI: M. Kantarcioglu, Purdue PI: B. Xi, $440,000, 08/01/2012 – 07/31/2015, Amount Responsible: $210,140
Publications
— Kantarcioglu, M., Xi, B., and Clifton, C., Classifier Evaluation and Attribute
Selection against Active Adversaries, Data Mining and Knowledge Discovery, 22(1-2), 291-335, 2011
— Zhou, Y., Kantarcioglu, M., Thuraisingham, B., and Xi, B., Adversarial Support
Vector Machine Learning, Proceedings of the 18th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, 1059–1067, KDD 2012
— Kantarcioglu, M.∗, and Xi, B.∗, Adversarial Data Mining, North Atlantic Treaty Organization (NATO) SAS-106 Symposium on Analysis Support to Decision Making in Cyber Defense, 1–11, June 2014, Estonia
— Zhou, Y., Kantarcioglu, M., Xi, B., (Hot Topic Essay) Adversarial Learning:
Mind Games with the Opponent, ACM Computing Reviews, August 2017
— Zhou, Y., Kantarcioglu, M., Xi, B., A Survey of Game Theoretic Approach for
Adversarial Machine Learning, revision submitted
— Wei, W., Xi, B., Kantarcioglu, M., Adversarial Clustering: A Grid Based Clustering Algorithm against Active Adversaries, submitted
— Zhou, Y., Kantarcioglu, M., Xi, B., Adversarial Active Learning, submitted