77
High Accuracy A,ack Provenance via BinaryBased ExE cu9on P ar99on (BEEP) Kyu Hyung Lee, Xiangyu Zhang, Dongyan Xu NDSS 2013, Feb 25 27

High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

High  Accuracy  A,ack  Provenance  via  Binary-­‐Based  ExEcu9on  Par99on  

(BEEP)  

Kyu  Hyung  Lee,      Xiangyu  Zhang,  Dongyan  Xu  

NDSS  2013,  Feb  25  -­‐  27  

Page 2: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Introduc9on  §  Attack provenance is important �•  Determine the “entry point” of an attack �•  Understand the damage to the victim system�

Page 3: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Introduc9on  §  Previous approaches�•  Generate causal graph by detecting system

level dependences�

Page 4: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Introduc9on  §  Previous approaches�•  Generate causal graph by detecting system

level dependences�− Process <-> File�

File  1  

Process  A  

Page 5: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Introduc9on  §  Previous approaches�•  Generate causal graph by detecting system

level dependences�− Process <-> File�− Process -> Process�

File  1  

Process  A  

Process  B  

Page 6: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Introduc9on  §  Previous approaches�•  Generate causal graph by detecting system

level dependences�− Process <-> File�− Process -> Process�− Process <-> Socket �

File  1  

Process  A  

Socket  

Process  B  

Page 7: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Introduc9on  §  Previous approaches�•  Generate causal graph by detecting system

level dependences�− Process <-> File�− Process -> Process�− Process <-> Socket �

•  Limitations �− Dependence explosion problem�− Granularity of “process” is too coarse�

File  1  

Process  A  

Socket  

Process  B  

Page 8: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim
Page 9: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Malware  

Page 10: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Firefox  

Malware  

Page 11: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Malware  

The user visited 11 web sites Dependence explosion!!

(229 IP Addresses)

Page 12: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Firefox  

Pine  

Malware  

Page 13: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13   message14  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

procmail  

sendmail  

.  .  .  

Page 14: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13   message14  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

procmail  

sendmail  

.  .  .  

Dependence  Explosions  :  51  Processes,  15  Files,    251  Network  addresses,    

351  Edges    

Page 15: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13   message14  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

procmail  

sendmail  

.  .  .  

Page 16: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13   message14  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

procmail  

sendmail  

.  .  .  

Page 17: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13   message14  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

procmail  

sendmail  

.  .  .  

Page 18: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13   message14  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

procmail  

sendmail  

.  .  .  

Page 19: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13   message14  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

procmail  

sendmail  

.  .  .  

Page 20: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13   message14  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

procmail  

sendmail  

.  .  .  

Page 21: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

BEEP  :  Our  Approach  

§  Finer-grained subject : Execution “UNIT”�•  Dynamically partition the execution of a

process into autonomous execution segments�

Long  running  process  

Page 22: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

BEEP  :  Our  Approach  

§  Finer-grained subject : Execution “UNIT”�•  Dynamically partition the execution of a

process into autonomous execution segments�

U6  U2   U3   U4   U5  U1  Long  running  process  

Page 23: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

BEEP  :  Our  Approach  

§  Finer-grained subject : Execution “UNIT”�•  Dynamically partition the execution of a

process into autonomous execution segments�

U6  U2   U3   U4   U5  U1  

Symptom  

Long  running  process  

Page 24: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

BEEP  :  Our  Approach  

§  Finer-grained subject : Execution “UNIT”�•  Dynamically partition the execution of a

process into autonomous execution segments�

U6  U2   U3   U4   U5  U1  

Symptom  

Long  running  process  

Page 25: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

BEEP  :  Our  Approach  

§  Finer-grained subject : Execution “UNIT”�•  Dynamically partition the execution of a

process into autonomous execution segments�•  Units are not always independent �− Detect causality between units�

U6  U2   U3   U4   U5  U1  

Symptom  

Long  running  process  

Page 26: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

BEEP  :  Our  Approach  

§  Finer-grained subject : Execution “UNIT”�•  Dynamically partition the execution of a

process into autonomous execution segments�•  Units are not always independent �− Detect causality between units�

U6  U2   U3   U4   U5  U1  

Symptom  

Long  running  process   U5  

Page 27: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13   message14  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

procmail  

sendmail  

.  .  .  

Page 28: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13   message14  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

procmail  

sendmail  

.  .  .  

Page 29: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

.  .  .  

Firefox  

Pine  

Malware  

message1   message2   message13  .  .  .  

sendmail  

procmail  

sendmail  

procmail  .  .  .  

sendmail  

procmail  

sendmail  

.  .  .  

Malware  

message14  

sendmail  

procmail  

Page 30: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

x.x.x.x  

Firefox  

Pine  

Malware  

message14  

sendmail  

procmail  

sendmail  

y.y.y.y  

Page 31: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

BEEP  :    10  Processes,  2  Files,    

6  Network  addresses,  23  Edges    

Previous  Approach  :    51  Processes,  15  Files,    

251  Network  addresses,  351  Edges    

16.3  9mes  smaller  

Page 32: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

BEEP  :  Our  Approach  

§  BEEP : Binary-based ExEcution Partition �•  Without source code�•  Reverse engineer units and unit dependences�•  Negligible runtime overhead (1.4% max.)�•  Low space overhead (12.28% avg.)�•  Effective (40.93 times smaller causal graph)�

Page 33: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Outline  

§  Identifying UNITs�

§  Inter-UNIT Dependences�

§  How BEEP Works�

§  Experimental Results�

§  Conclusion �

Page 34: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Outline  

§  Identifying UNITs�

§  Inter-UNIT Dependences�

§  How BEEP Works�

§  Experimental Results�

§  Conclusion �

Page 35: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Iden9fying  Units  

Page 36: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Iden9fying  Units  

     Characteris9cs  of  long  running  programs  

     ✔  Driven  by  external  requests        ✔  Dominated  by  event  processing  loops  

Page 37: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Iden9fying  Units  §  Event processing loop �

Page 38: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Iden9fying  Units  

Execution Unit :

Iteration of event processing loop

§  Event processing loop �

Page 39: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Iden9fying  Units  §  Detecting Unit Loops �•  High level loop �

Page 40: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Iden9fying  Units  §  Detecting Unit Loops �•  High level loop �

Page 41: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Iden9fying  Units  §  Detecting Unit Loops �•  High level loop �•  Receive inputs and produce outputs �

accept(…)  recvfrom(…)  read(…)  write(…)  sendto(…)  close(…)  

Page 42: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Outline  

§  Identifying UNITs�

§  Inter-UNIT Dependences�

§  How BEEP Works�

§  Experimental Results�

§  Conclusion �

Page 43: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

§  A unit alone may not correspond to a semantically independent sub-execution �

Page 44: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

void *worker_thread(..) {"

while(!worker_may_exit) {"

req=ap_queue_pop();"

process_request(req);"

} // while end

} // worker_thread end

820 … 842 … 862 … 894 … 899 … 906

<Apache-2.2.21> - Multi-thread

void *listener_thread(..) {"

while(1) {"

req=accept_func(..);"

ap_queue_push(req);"

} // while end

} // listener_thread end

593 … 631 … 742 … 768 … 798 … 810

Inter-­‐Unit  Dependences  

Page 45: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

void *worker_thread(..) {"

while(!worker_may_exit) {"

req=ap_queue_pop();"

process_request(req);"

} // while end

} // worker_thread end

820 … 842 … 862 … 894 … 899 … 906

<Apache-2.2.21> - Multi-thread

void *listener_thread(..) {"

while(1) {"

req=accept_func(..);"

ap_queue_push(req);"

} // while end

} // listener_thread end

593 … 631 … 742 … 768 … 798 … 810

Page 46: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

void *worker_thread(..) {"

while(!worker_may_exit) {"

req=ap_queue_pop();"

process_request(req);"

} // while end

} // worker_thread end

820 … 842 … 862 … 894 … 899 … 906

<Apache-2.2.21> - Multi-thread

void *listener_thread(..) {"

while(1) {"

req=accept_func(..);"

ap_queue_push(req);"

} // while end

} // listener_thread end

593 … 631 … 742 … 768 … 798 … 810

Page 47: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

void *worker_thread(..) {"

while(!worker_may_exit) {"

req=ap_queue_pop();"

process_request(req);"

} // while end

} // worker_thread end

820 … 842 … 862 … 894 … 899 … 906

<Apache-2.2.21> - Multi-thread

void *listener_thread(..) {"

while(1) {"

req=accept_func(..);"

ap_queue_push(req);"

} // while end

} // listener_thread end

593 … 631 … 742 … 768 … 798 … 810

Page 48: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

void *worker_thread(..) {"

while(!worker_may_exit) {"

req=ap_queue_pop();"

process_request(req);"

} // while end

} // worker_thread end

820 … 842 … 862 … 894 … 899 … 906

<Apache-2.2.21> - Multi-thread

void *listener_thread(..) {"

while(1) {"

req=accept_func(..);"

ap_queue_push(req);"

} // while end

} // listener_thread end

593 … 631 … 742 … 768 … 798 … 810

Inter-­‐Unit  Dependences  

Page 49: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

void *worker_thread(..) {"

while(!worker_may_exit) {"

req=ap_queue_pop();"

process_request(req);"

} // while end

} // worker_thread end

820 … 842 … 862 … 894 … 899 … 906

<Apache-2.2.21> - Multi-thread

void *listener_thread(..) {"

while(1) {"

req=accept_func(..);"

ap_queue_push(req);"

} // while end

} // listener_thread end

593 … 631 … 742 … 768 … 798 … 810

Page 50: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

void *worker_thread(..) {"

while(!worker_may_exit) {"

req=ap_queue_pop();"

process_request(req);"

} // while end

} // worker_thread end

820 … 842 … 862 … 894 … 899 … 906

<Apache-2.2.21> - Multi-thread

void *listener_thread(..) {"

while(1) {"

req=accept_func(..);"

ap_queue_push(req);"

} // while end

} // listener_thread end

593 … 631 … 742 … 768 … 798 … 810

Inter-­‐Unit  Dependences  

Page 51: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

§  A unit alone may not correspond to a semantically independent sub-execution �•  A few units together compose an

autonomous sub-execution �

req  =  accept();  push_queue(req);  

UNIT_Listener  req  =  pop_queue();  process(req);  send_result();  

UNIT_Worker  

Page 52: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

§  A unit alone may not correspond to a semantically independent sub-execution �•  A few units together compose an

autonomous sub-execution �

•  Dependences through workflow object �− Ex) queue – enqueue, dequeue�

req  =  accept();  push_queue(req);  

UNIT_Listener  req  =  pop_queue();  process(req);  send_result();  

UNIT_Worker  

Page 53: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

req1  =  accept();  push_queue(req1);  

UNIT_Listener_1  req1  =  pop_queue();  process(req1);  update(log_buf);  

UNIT_Worker_1  

req2  =  accept();  push_queue(req2);  

UNIT_Listener_2  req2  =  pop_queue();  process(req2);  update(log_buf);  

UNIT_Worker_2  

req3  =  accept();  push_queue(req3);  

UNIT_Listener_3  req3  =  pop_queue();  process(req3);  update(log_buf);  

UNIT_Worker_3  

Page 54: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

req1  =  accept();  push_queue(req1);  

UNIT_Listener_1  req1  =  pop_queue();  process(req1);  update(log_buf);  

UNIT_Worker_1  

req2  =  accept();  push_queue(req2);  

UNIT_Listener_2  req2  =  pop_queue();  process(req2);  update(log_buf);  

UNIT_Worker_2  

req3  =  accept();  push_queue(req3);  

UNIT_Listener_3  req3  =  pop_queue();  process(req3);  update(log_buf);  

UNIT_Worker_3  

Workflow  Dependences  •  Represent  high-­‐level  program  workflow  •  Detect  semanXcally  relevant  units  

Page 55: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

req1  =  accept();  push_queue(req1);  

UNIT_Listener_1  req1  =  pop_queue();  process(req1);  update(log_buf);  

UNIT_Worker_1  

req2  =  accept();  push_queue(req2);  

UNIT_Listener_2  req2  =  pop_queue();  process(req2);  update(log_buf);  

UNIT_Worker_2  

req3  =  accept();  push_queue(req3);  

UNIT_Listener_3  req3  =  pop_queue();  process(req3);  update(log_buf);  

UNIT_Worker_3  

Page 56: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

req1  =  accept();  push_queue(req1);  

UNIT_Listener_1  req1  =  pop_queue();  process(req1);  update(log_buf);  

UNIT_Worker_1  

req2  =  accept();  push_queue(req2);  

UNIT_Listener_2  req2  =  pop_queue();  process(req2);  update(log_buf);  

UNIT_Worker_2  

req3  =  accept();  push_queue(req3);  

UNIT_Listener_3  req3  =  pop_queue();  process(req3);  update(log_buf);  

UNIT_Worker_3  

Page 57: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

req1  =  accept();  push_queue(req1);  

UNIT_Listener_1  req1  =  pop_queue();  process(req1);  update(log_buf);  

UNIT_Worker_1  

req2  =  accept();  push_queue(req2);  

UNIT_Listener_2  req2  =  pop_queue();  process(req2);  update(log_buf);  

UNIT_Worker_2  

req3  =  accept();  push_queue(req3);  

UNIT_Listener_3  req3  =  pop_queue();  process(req3);  update(log_buf);  

UNIT_Worker_3  

         Low  Level  Dependences  •  Not  part  of  the  workflow  •  Caused  by  low  level  behavior  •  Induce  arbitrary  dependences  

Page 58: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

req1  =  accept();  push_queue(req1);  

UNIT_Listener_1  req1  =  pop_queue();  process(req1);  update(log_buf);  

UNIT_Worker_1  

req2  =  accept();  push_queue(req2);  

UNIT_Listener_2  req2  =  pop_queue();  process(req2);  update(log_buf);  

UNIT_Worker_2  

req3  =  accept();  push_queue(req3);  

UNIT_Listener_3  req3  =  pop_queue();  process(req3);  update(log_buf);  

UNIT_Worker_3  

Page 59: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Inter-­‐Unit  Dependences  

req1  =  accept();  push_queue(req1);  

UNIT_Listener_1  req1  =  pop_queue();  process(req1);  update(log_buf);  

UNIT_Worker_1  

req2  =  accept();  push_queue(req2);  

UNIT_Listener_2  req2  =  pop_queue();  process(req2);  update(log_buf);  

UNIT_Worker_2  

req3  =  accept();  push_queue(req3);  

UNIT_Listener_3  req3  =  pop_queue();  process(req3);  update(log_buf);  

UNIT_Worker_3  

           Characteris9cs  of  workflow  objects  

     ✔  Dynamically  allocated  at  run-­‐Xme        ✔  Never  shared  by  units  from  the  same  loop  

Page 60: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Outline  

§  Identifying UNITs�

§  Inter-UNIT Dependences�

§  How BEEP Works�

§  Experimental Results�

§  Conclusion �

Page 61: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

How  BEEP  Works  StaXc  Analysis  

Page 62: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

How  BEEP  Works  StaXc  Analysis  

Training  Runs  

Page 63: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

How  BEEP  Works  StaXc  Analysis  

Training  Runs  

InstrumentaXon  

Page 64: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

How  BEEP  Works  StaXc  Analysis  

Training  Runs  

InstrumentaXon  

Logging  

Symptom  &    Log  Analysis  

Page 65: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Outline  

§  Identifying UNITs�

§  Inter-UNIT Dependences�

§  How BEEP Works�

§  Experimental Results�

§  Conclusion �

Page 66: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Logging  Overhead  (Time)  

0.0%

0.2%

0.4%

0.6%

0.8%

1.0%

1.2%

1.4%

Tim

e O

verh

ead(

%)

Page 67: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Logging  Overhead  (Time)  

0.0%

0.2%

0.4%

0.6%

0.8%

1.0%

1.2%

1.4%

Tim

e O

verh

ead(

%)

Page 68: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Logging  Overhead  (Space)  App   Log  Size  (MB)   Space  Overhead  

(%)  Original   Instrumented  

Apache   0.227   0.278   22.4%  

Vim   85.75   89.4   4.2%  

Firefox   46.12   63.25   37.1%  

Cherokee   1.1   1.126   2.2%  

W3M   26.88   28.39   5.6%  

Prodpd   0.54   0.56   2.3%  

Wget   17.47   17.48   0.6%  

Yafc   7.29   7.41   1.6%  

Transmission   4.92   5.37   9.1%  

Pine   5.62   6.21   10.5%  

Bash   0.63   0.64   0.6%  

MC   1.56   1.67   6.9%  

Sshd   2.2   2.22   0.8%  

Sendmail   0.214   0.22   2.8%  

Page 69: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Logging  Overhead  (Space)  App   Log  Size  (MB)   Space  Overhead  

(%)  Original   Instrumented  

Apache   0.227   0.278   22.4%  

Vim   85.75   89.4   4.2%  

Firefox   46.12   63.25   37.1%  

Cherokee   1.1   1.126   2.2%  

W3M   26.88   28.39   5.6%  

Prodpd   0.54   0.56   2.3%  

Wget   17.47   17.48   0.6%  

Yafc   7.29   7.41   1.6%  

Transmission   4.92   5.37   9.1%  

Pine   5.62   6.21   10.5%  

Bash   0.63   0.64   0.6%  

MC   1.56   1.67   6.9%  

Sshd   2.2   2.22   0.8%  

Sendmail   0.214   0.22   2.8%  

Page 70: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Logging  Overhead  (Space)  App   Log  Size  (MB)   Space  Overhead  

(%)  Original   Instrumented  

Apache   0.227   0.278   22.4%  

Vim   85.75   89.4   4.2%  

Firefox   46.12   63.25   37.1%  

Cherokee   1.1   1.126   2.2%  

W3M   26.88   28.39   5.6%  

Prodpd   0.54   0.56   2.3%  

Wget   17.47   17.48   0.6%  

Yafc   7.29   7.41   1.6%  

Transmission   4.92   5.37   9.1%  

Pine   5.62   6.21   10.5%  

Bash   0.63   0.64   0.6%  

MC   1.56   1.67   6.9%  

Sshd   2.2   2.22   0.8%  

Sendmail   0.214   0.22   2.8%  

Page 71: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Logging  Overhead  (One  day)  

700MB  

               101MB    (14.42%  overhead)  

Server  programs   UI  programs  

Sshd-­‐5.9,  Sendmail-­‐8.12.11  Prodpd-­‐1.3.4,  Apache-­‐2.2.21  

Cherokee-­‐1.2.1  

Wget-­‐1.13,  W3m-­‐0.5.2,  Pine-­‐4.64,  MC-­‐4.6.1,  Vim-­‐7.3,  Bash-­‐4.2,  Firefox-­‐11,  

Yafc-­‐1.1.1,  Transmission-­‐2.6  

Page 72: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Effec9veness  §  Attack Ramification �•  332 times smaller�

§  Information Stealing �•  7.8 times smaller�

#  of  Processes   #  of  Files   #  of  Sockets   #  of  Edges  

Previous  approach   2   11   38   51  

BEEP   2   4   1   6  

#  of  Processes   #  of  Files   #  of  Sockets   #  of  Edges  

Previous  approach   348   512   0   2,135  

BEEP   4   1   0   4  

Page 73: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Related  Works  

§  Coarse-grained analysis � [S. T. King et al. ’03], [A. Goel et al. ’05] �§ Whole system replay� [T. Kim et al. ’10], [R. Chandra et al. ’11] �§  Coarse-grained + instruction logging � [X. Wang et al. ’09], [P. Barham et al. ’04] �§  Instruction level information flow � [H. Yin et al. ’07], [B. C. Tak et al. ’09] �

Page 74: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Conclusion  

§  BEEP : Binary-based ExEcution Partition �•  Highly accurate provenance tracing technique�− Fine-grained autonomous units�− Reverse engineer units and unit dependences�− Without source code�

•  Negligible runtime overhead (1.4% max.)�•  Low space overhead (12.28% avg.)�•  Effective (40.93 times smaller causal graph)�

Page 75: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim
Page 76: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Unit  Loop  Detec9on  

§ Missing unit loops�•  Undesirable unification of dependences�− May cause false dependences�− Does not affect correctness�

§  Including bogus loops �•  Unit dependence analysis will detect the

causal dependences between two loops�− Cause additional overhead�− Does not affect correctness�

Page 77: High%Accuracy%AackProvenance%via% Binary4Based …...Introduc9on! Attack provenance is important • Determine the “entry point” of an attack • Understand the damage to the victim

Unit  Dependence  Detec9on  

§ Missing unit dependences�•  BEEP detects the workflow dependences�− As long as trainings cover “some” of instructions�

§  Including low level dependences�•  May lead to bogus inter-unit dependences�− Does not affect correctness�