Malware-Aware Processors: A Framework for Efficient Online Malware Detection

Meltem Ozsoy*, Caleb Donovick*, Iakov Gorelik*, Nael Abu-Ghazaleh** and Dmitry Ponomarev*

* Binghamton University, ** University of California, Riverside

HPCA 2015 - San Francisco, CA

Malware Growth

Anti-virus software

OS Level Defenses

Execution Monitoring

AV Test Malware Statistics, 2014 (http://www.av-test.org/en/statistics/malware/)

What This Work is All About

• Comprehensive execution monitors are too heavyweight to be always on
  – Performance loss

• Low-level indicators were shown to be effective for classifying malware
  – Demme et al. (ISCA 2013) proposed offline detection using performance counters
  – Our contribution: online detection in hardware

• Hardware classifiers are not perfect, thus:
  – Two Level Detection Framework: use the hardware-based detector to prioritize the work of the heavyweight software detector

Two Level Detection Framework

Malware Detection

• Static Analysis
  – Study program without execution
  – Signature generation with byte/instruction sequences
  – Using source code, CFG generation

• Limitations of Static Analysis
  – Requires source code, disassembly
  – Metamorphic malware (Self Modifying Code)
  – Polymorphic (encrypted) malware
  – Non-deterministic inputs can change program flow

Malware Detection

• Dynamic Analysis
  – System calls, function parameters, API calls, created processes/threads, etc. are monitored
  – Expensive, uses VM or emulator

• Limitations of Dynamic Analysis
  – Only effective against analyzed malware
  – Advanced Persistent Threats (APTs) can bypass with zero-day exploits

Execution Monitoring

• System Call Forwarding
  – Proxos (OSDI'06)

• VM Introspection, Isolated Monitoring
  – Livewire (NDSS'03), Virtuoso (IEEE Security & Privacy'11)

• Reference Monitoring
  – PinOS (ACM VEE'07), Kernel DBT (ASPLOS'12)

[Figure: execution monitor (EM) placement diagrams. Labels: VM, Application, Modified Application, EM, Kernel]

Malware Detection at Low-level

• Sub-semantic Monitoring
  – Low-level indicators of a program, such as performance counters (Demme et al., ISCA'13), are monitored

• Limitations
  – Detection is after the fact
  – Not real-time
  – Features are limited to available performance counters

Our Proposal: MAP

• Malware Aware Processor (MAP)
  – Use hardware for sub-semantic detection
    • Train a simple machine learning algorithm
    • Periodic checks during execution
  – Perform online detection using time series analysis in hardware
  – High-overhead software analysis activated only for suspicious programs (Two Level Detection)
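
The two-level idea can be illustrated with a minimal Python sketch. It assumes each process carries a hardware suspicion score in [0, 1]; the names `hw_scores`, `prioritize`, `two_level_scan`, the `threshold` value, and the `budget` parameter are all hypothetical and do not come from the slides.

```python
def prioritize(hw_scores, threshold=0.5):
    """Return PIDs whose hardware score reaches `threshold`, most suspicious first.

    hw_scores: dict mapping pid -> suspicion score in [0, 1] (hypothetical input).
    threshold: illustrative cut-off, not a value from the slides.
    """
    flagged = [(score, pid) for pid, score in hw_scores.items() if score >= threshold]
    flagged.sort(key=lambda item: item[0], reverse=True)
    return [pid for _, pid in flagged]


def two_level_scan(hw_scores, software_detector, budget):
    """Run the heavyweight software detector on at most `budget` flagged processes."""
    verdicts = {}
    for pid in prioritize(hw_scores)[:budget]:
        verdicts[pid] = software_detector(pid)
    return verdicts
```

Processes that the hardware never flags are not handed to the software detector at all in this sketch, which is where the overhead saving would come from.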

MAP Design Overview

[Figure: processor pipeline block diagram including the MAP unit. Block labels: Instruction Fetch, Instruction Cache, Rename/Decode, Branch Prediction, Physical Register File, Issue, Functional Units, ROB & Architectural Register File, MMU, Data Cache, Exception Unit, MAP]

Sub-Semantic Feature Space

• Architectural
  – ARCH: Frequency of memory reads/writes, taken & immediate branches, and unaligned memory accesses

• Memory Address
  – MEM1: Frequency of memory address distance histogram
  – MEM2: Memory address distance histogram mix

• Instruction
  – INS1: Frequency of instruction categories
  – INS2: Difference between two most frequent opcodes
  – INS3: Existence of categories
  – INS4: Existence of opcodes
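
As a rough illustration of the instruction features, the sketch below computes an INS1-style vector (category frequencies) and an INS4-style vector (opcode existence) for one window of executed opcodes. The opcode-to-category map `CATEGORY_OF` is a made-up example, since the slides do not list the categories or opcodes actually used.

```python
from collections import Counter

# Hypothetical opcode-to-category map; the slides do not list the real categories.
CATEGORY_OF = {
    "mov": "data", "push": "data", "pop": "data",
    "add": "arith", "sub": "arith", "xor": "logic",
    "jmp": "control", "call": "control", "ret": "control",
}
CATEGORIES = sorted(set(CATEGORY_OF.values()))
OPCODES = sorted(CATEGORY_OF)


def ins1_category_frequency(window):
    """INS1-style vector: fraction of the window's instructions in each category."""
    counts = Counter(CATEGORY_OF[op] for op in window if op in CATEGORY_OF)
    total = len(window) or 1  # avoid division by zero on an empty window
    return [counts[c] / total for c in CATEGORIES]


def ins4_opcode_existence(window):
    """INS4-style vector: 1 if the opcode appears in the window, else 0."""
    seen = set(window)
    return [1 if op in seen else 0 for op in OPCODES]
```

Existence features need only one bit per opcode, while frequency features need a counter per category, which hints at why the two differ in hardware cost.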

Machine Learning Algorithms

• Logistic Regression
  – Hypothesis function (ax1 + bx2 + ... + c) is trained to figure out weights (a, b, c)
  – Sigmoid function translates the hypothesis function to a value between 0 and 1

• Neural Network (multi-layer perceptron)
  – One hypothesis function trained for each layer
  – Translation function is tanh
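
The two models can be written down in a few lines of Python. This is a minimal inference-only sketch: the function names, the list-based weight layout, and the layer shapes are illustrative assumptions, and training is not shown.

```python
import math


def logistic_regression(x, weights, bias):
    """Sigmoid of the linear hypothesis w.x + bias; the output lies in (0, 1)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, layers):
    """Forward pass of a multi-layer perceptron with tanh activations.

    layers: list of (weight_matrix, bias_vector) pairs, one per layer, where
    weight_matrix[j] holds the weights feeding neuron j (illustrative layout).
    """
    activation = x
    for weight_matrix, bias_vector in layers:
        activation = [
            math.tanh(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weight_matrix, bias_vector)
        ]
    return activation
```

Logistic regression needs one multiply-accumulate chain and one sigmoid, whereas the network repeats that work for every neuron in every layer.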

Data Set & Data Collection

Family       Train  Test  Val  Extended Test  Total
Vundo           14     2    5             21     42
Emerleox        10     5    4             33     52
Virut            8     3    7             46     64
Sality          12     2    4             46     64
Ejik             7     6    4            101    118
Looper          10     3    6            145    164
AdRotator       14     1    2            119    136
PornDialer      11     6    4            196    217
Boaxxe          13     6    0            211    230
Total           99    34   36            918   1087

• 32-bit Windows 7 on VirtualBox

• Windows Security Services disabled

• Features collected through PIN during execution of malware

• University Of Mannheim dataset

• Offensive Computing

• VirusTotal

Selecting Features for Classification

• Offline detection performance

• Low hardware implementation complexity

Used for hardware implementation

Key Aspects of MAP Operation

• Machine Learning model trained at design time

• Weights for the model are loaded into MAP hardware

• While program executes, MAP hardware collects features at instruction commit stage

• For every 10K committed instructions, a binary decision (malware/regular) is made
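
The operation described above can be mimicked in software as a streaming collector. In this sketch the class `CommitStageCollector` and the `classify` callable are hypothetical stand-ins for the feature-collection logic and the loaded model. Only the 10K window size comes from the slide.

```python
WINDOW = 10_000  # committed instructions per decision, as stated on the slide


class CommitStageCollector:
    """Counts opcodes at commit and produces one binary decision per window."""

    def __init__(self, classify, window=WINDOW):
        self.classify = classify  # hypothetical callable: dict of opcode counts -> bool
        self.window = window
        self.counts = {}
        self.committed = 0

    def commit(self, opcode):
        """Record one committed instruction; return a decision at window end, else None."""
        self.counts[opcode] = self.counts.get(opcode, 0) + 1
        self.committed += 1
        if self.committed < self.window:
            return None
        decision = self.classify(self.counts)
        # Reset the counters so the next window starts from a clean state.
        self.counts = {}
        self.committed = 0
        return decision
```

Collecting at commit means only instructions that actually retire are counted, so wrong-path speculative work does not pollute the features.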

MAP Online Detection

• A binary signal is produced for every 10K instructions during execution

• Exponentially Weighted Moving Average (EWMA) is used for filtering out occasional false positives/negatives

• Additional optimizations for efficient hardware implementation
  – Fixed-point representation
  – Sliding window of signals
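
A fixed-point EWMA over the per-window decisions might look like the sketch below. The constants (`FRAC_BITS`, `ALPHA_SHIFT`, `THRESHOLD`) are illustrative choices, not values from the slides. The sketch uses the standard recursive EWMA form, not the sliding-window variant the slide mentions.

```python
FRAC_BITS = 8          # fixed-point fractional bits (illustrative choice)
ONE = 1 << FRAC_BITS   # fixed-point representation of 1.0
ALPHA_SHIFT = 2        # alpha = 1 / 2**ALPHA_SHIFT = 0.25 (illustrative choice)
THRESHOLD = ONE // 2   # flag when the average exceeds 0.5 (assumption)


def ewma_update(avg, decision):
    """One fixed-point EWMA step: avg += alpha * (sample - avg), using a shift."""
    sample = ONE if decision else 0
    return avg + ((sample - avg) >> ALPHA_SHIFT)


def filter_decisions(decisions):
    """Return the smoothed malware flag after each per-window binary decision."""
    avg = 0
    flags = []
    for d in decisions:
        avg = ewma_update(avg, d)
        flags.append(avg > THRESHOLD)
    return flags
```

With these constants, `filter_decisions([0, 0, 1, 0, 0])` never raises the flag, so an isolated false positive is absorbed, while a sustained run of positive decisions eventually crosses the threshold. Choosing alpha as a power of two lets the multiplication become a shift.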

Hardware Implementation

Logistic Regression

Neural Network

MAP FPGA Implementation

Example of EWMA

Logistic Regression

Neural Network

Results

Key Results of MAP

• Best performing feature is based on instruction opcodes

• MAP achieves 89% real-time detection with only 6% false positives using simple LR prediction

• Physical design overhead
  – Cycle time: 1.9% (LR), 5.5% (NN)
  – Area: 0.3% (LR), 5.7% (NN)
  – Power: 0.1% (LR), 1.7% (NN)

Future Directions

• MAP can be extended as a configurable malware detection engine
  – Updating weights for new malware
  – Configuring features

• Integrated FPGAs in new CPU designs (Intel Xeon) can be used for MAP

THANK YOU!

Questions?