Upload
ngokhuong
View
216
Download
0
Embed Size (px)
Citation preview
Towards Scalable Cluster Auditing through Grammatical Inference
over Provenance Graphs WajihUlHassan,MarkLemay,NurainiAguse,
AdamBates,ThomasMoyer
NDSS Symposium 2018 Feb 20, 2018
Notable Data Breach in 2017
3
Equifax Data Breach Timeline 2017
apr may jun jul aug sep oct
Breached Detected
Hackers in Equifax Servers
Patched
Breached Announced
Notable Data Breach in 2017
4
Equifax Data Breach Timeline 2017
apr may jun jul aug sep oct
Breached Detected
Hackers in Equifax Servers
Patched
Breached Announced
3 Months of crucial attack audit logs
Notable Data Breach in 2017
5
Equifax Data Breach Timeline 2017
apr may jun jul aug sep oct
Breached Detected
Hackers in Equifax Servers
Patched
Breached Announced
3 Months of crucial attack audit logs Are current auditing systems scalable?
Data Provenance aka Audit log � Lineage of system activities � Represented as Directed Acyclic Graph (DAG) � Used for forensic analysis
6
1. Bash,SpawnsNGINX2. NGINX,Receivesfromabc.com3. NGINX,ReadsFileindex.html4. ….......
index.html
NGINX
abc.com
Auditlog
Bash
ProvenanceGraph
Bash:exec(“./NGINX”);NGINX:recv(…,“abc.com”);fread(“index.html”);
CodeExecution
Data Provenance in a Cluster
7
WorkerNodes
MasterNode
Centralized auditing not practical due to two
limitations
Limitation#1: Graph Complexity � NGINX and MySQL running for 5 mins on a single machine
8
Finding needle in a haystack problem
Limitation#2: Storage overhead � Leads to network overhead as logs are transferred to master
node
9
[VALUE]GB
[VALUE]GB
[VALUE]GB
[VALUE]GB
0
2
4
6
8
10
12
14
Day1 Day2 Day3 Day4 Day5
LogSize(G
B)
AuditLogSizeGrowthforaSingleNGINXserver
Uncompressed Compressed
2.54GB/Dayonasinglemachine
Withcompression>1GB/Day
Winnower
� Cluster applications are replicated in accordance with microservice architecture principle
� Replicated apps produce highly homogeneous provenance graphs � core execution behaviour is similar
10
Key Idea: Remove redundancy from provenance graphs across cluster before sending to master node
11
Before
/up/*
NGINX
*
Bash
mysql
*
mysqld
/db/*
Master Node View with Winnower
After
Otherlibraryfilevertices
Winnower � Build consensus model across cluster using graph grammars � Like string grammar, graph grammars provide rule-based
mechanisms � For generating, manipulating and analyzing graphs � Induction – produce grammar from a given graph � Parsing – membership test of a given graph is in a grammar
12
a a
t t t
b b b
a
S ≔ A T
A ≔ a B
T t
B≔
b
S
S ≔ e
a e
t
b
Graph GraphGrammar
Architecture
13
AuditModule
Prov.Graph
Worker Nodes
ModelAggregator
Master Node
Worker Node
Fetchgraphateachepoch
——
——
——
—— —
—
——
——
———
—
Fine-grained Graph
Abstracted Graph
Graph
Abstraction
ModelGraph
Graph
Induction
WinnowerAgent
Modelgraphs/grammarsfromcluster
Architecture
14
AuditModule
ModelAggregator
——
——
——
—— —
—
——
——
———
—
Fine-grained Graph
Abstracted Graph
Graph
Abstraction
ModelGraph
Graph
Induction
WinnowerAgent
AggregatedModelOnlysendModelupdates
Worker Nodes
Master Node
Worker Node
Prov.Graph
Fetchgraphatnextepoch
Architecture
15
AuditModule
ModelAggregator
WinnowerAgent
QuerypartofProvenancegraphHigh-fidelity
Provenancegraph
Worker Nodes
Master Node
Worker Node
Provenance Graph Abstraction � Graph Induction process builds a model/grammar that concisely
describe the whole graph � However, instance-specific fields frustrate any attempts to build a
generic application behaviour model
16
NoGeneralmodelasinstancespecificinformationsuchPIDisdifferentamonggraphs
ftppid:2788
ftp workerpid:2797
192.168.0.2
ftp listenerpid:2789
192.168.0.1
ftp listenerpid:2791
ftp workerpid:2795
192.168.0.2192.168.0.1 /up/File1Inode:3
/up/File2Inode:5
ftppid:2780
Node 1 Node 2
GraphInduction
Provenance Graph Abstraction � Provenance graph vertices have well defined fields
� E.g. pid:1234 , FilePath:/etc/ld.so
� Defined rules manually that remove or generalize these fields
ftppid:2788
ftp workerpid:2797
192.168.0.2
ftp listenerpid:2789
192.168.0.1
ftp listenerpid:2791
ftp workerpid:2795
192.168.0.2192.168.0.1 /up/File1Inode:3
/up/File2Inode:5
ftppid:2780
Node 1 Node 2ftp
ftp worker
192.168.0.0/24
ftp listener
192.168.0.0/24
ftp listener
ftp worker
192.168.0.0/24192.168.0.0/24/up/* /up/*
ftp
Node 2Node 1
GraphAbstraction
Provenance Graph Induction � Deterministic Finite Automata (DFA) Learning to generate grammar
� Encodes the causality in generated models � In DFA learning the present state of a vertex includes the path taken
to reach the vertex (provenance ancestry) � Winnower extends it to remember descendants (provenance progeny)
� State of each vertex consist of three items: 1. Label 2. Provenance ancestry 3. Provenance progeny
18
File1.txt
gzip
Bash
File1.txt
Progenyofgzipvertex
Ancestryofgzipvertex
Provenance Graph Induction � Finds repetitive patterns using standard implicit and explicit
state merging algorithm � Implicit state merging combines two subgraphs if states of each
vertex are same in both subgraphs
19
ftp
ftp worker
192.168.0.0/24
ftp listener
192.168.0.0/24
ftp listener
ftp worker
192.168.0.0/24192.168.0.0/24/up/* /up/*
ftp
Node 2Node 1 ftp192.168.0.0/24
ftp listener
ftp worker
192.168.0.0/24 /up/*
Confidence levelLegend 2
GraphInduction
java
javamapper
data
javareducer
java
java mapper
data
javareducer
Node 1
Explicit State Merging
20Mergetwonodes
� At high-level explicit state merging � Picks two nodes and make their states same � Check if subgraph can be merged implicitly
� Consider a chained map reduce job
java
java mapper
data
javareducer
java
java mapper
data
javareducer
Node 1
:=SS:=AS:=T|VT:=A->X|A->YX:=B->WY:=C->WW:=D|D->SA:=dataB:=javamapperC:=javareducerD:=java
GraphGrammar
A
B
D
C
Provenance Graph Induction � Consider a graph with a malicious activity � Malicious behavior is visible in the final model
21
GraphInduction
ftp
ftp worker
ftp listener
ftp listener
ftp worker/up/*
/up/*
bash
Malicious filewget
x.x.x.x
ftp
Node 1
ftp ftp listener
ftp worker/up/*
Node 3
Node 2ftp
ftp listener
ftp worker
/up/*
bash
Malicious file
wget
x.x.x.x
Confidence levelLegend 1 3
Master Node
Evaluation Setup
� Setup � 1 VM as master node, 4 VMs as worker nodes � SPADE and Docker Swarm � Epoch size 50 sec
� Metrics � Storage Overhead � Computational Cost � Effectiveness
22
Storage Overhead on Master Node
23
0 100 200 300 400 500 600 700
HTTPD
ProFTPD
MySQL
485
630
130
0.11
0.12
0.17
LOGSIZEINMB
Winnower Raw
98.7%decrease
Storage Reduction on Master Node
24
� Apache Webserver with moderate workload
� Note the log scale on y-axis
1
10
100
1000
10000
100000
50 100 150 200 250 300 350 400 450 500 550 600
LOG
SIZ
E (M
B)
TIME (SEC)
Raw(Uncompressed) Raw(Compressed) Winnower
7zcompressionisnotsuitable:• Noglobalviewofcluster• Oblivioustopreviousbatch
Evaluation: Computation Cost
25
� Average time spent in induction and membership test at each epoch
0
5
10
15
20
25
30
35
50 100 150 200 250 300 350 400 450 500 550 600
AverageTime(sec)
ElapsedTime(sec)
Apache MySQL ProFTPD
HeterogeneousWorkload->
Updatesmodel
GenerateModelforfirsttime
Membershipcheckinexistingmodel
Case Study: Ransomware Attack
26
• Attacker exploits Redis database server vulnerability version < 3.2
• Vulnerability allows attacker to change SSH key and log in as Root
• Attacker deletes the database and left a note using vim to send bitcoins get database back
Traditional Graph of Attack
27
� 10 instances of redis running in the cluster � ~80k vertices and ~83K edges with 161 MB size � Part of provenance graph shown below
28
Winnower Generated Provenance graph
� 54 vertices and 68 edges with 0.7 MB size � Part of graph is shown below:
Worker
* /uploads/*
redis-server
x.x.x.x
Attack Provenance
Nginx
*
bash
/root/.ssh/authorized_keys
*
172.17.0.0/24
/var/lib/redis/dump.rdb
/proc/12743/stat
/var/log/redis/redis.log
x.x.x.x sshd bash
/root/ransomware.notevim
/dev/tty
Other library files
Confidence levelLegend 1 10
29
Winnower Generated Provenance graph
� What happens if we attack all the nodes in the cluster
Worker
* /uploads/*
redis-server
x.x.x.x
Attack Provenance
Nginx
*
bash
/root/.ssh/authorized_keys
*
172.17.0.0/24
/var/lib/redis/dump.rdb
/proc/12743/stat
/var/log/redis/redis.log
x.x.x.x sshd bash
/root/ransomware.notevim
/dev/tty
Other library files
Confidence levelLegend 10
� Winnower is the first practical system for provenance-based auditing of clusters at scale with low overhead
� Winnower significantly improves attack identification and investigation in a large cluster
30
Conclusion
Threat model
� Assumptions � Winnower only tracks user-space attacks i.e. trusts the OS � Log integrity is maintained
� Attack surface � Distributed application replicated on Worker nodes
� Attacker’ motive � Gain control over worker node by exploiting a software vulnerability in
the distributed application
33