
Transcript
Page 1: Malware Detection

Malware Detection

Slides courtesy of Mihai Christodorescu

Page 2: Malware Detection

University of Wisconsin, Madison Mihai Christodorescu – “Behavior-based Malware Detection” 2

The Rising Malware Tide

• Malware is software with unwanted functionality.
  Viruses, trojans, backdoors, bots, adware, spyware, browser hijackers, downloaders, droppers, keyloggers, password stealers, ...

• “Blended” threats

100,000,000 machines are infected. [Vint Cerf, World Economic Forum 2007]

Page 3: Malware Detection


Organized Cyber-Crime

• Boom in online fraud:
  – Spamming
  – Trade in stolen data
  – Financial fraud
  – ID theft

Malware is the tool of the trade.

Page 4: Malware Detection


The Changing Threat Landscape

1995: Hobby malware, for fun
• Show programming prowess
• Single author

2007: Professional malware, for profit
• Collaborative development
• Bug-fix releases, code reuse

Botnets: distributed computing has finally arrived.

[Image captions: “Creator of the Melissa worm” and “?”]

Page 5: Malware Detection


Failure of Signature Detectors

Malware detectors still use signatures.

Malware is obfuscated/transformed easily.
Software diversity used successfully by malware.

[Figure labels: Internet, Virus Scanner, Known Malware, New Malware 1, New Malware 2]

Signature data shown in the figure:
ac028c0e86009d8edfac0ac075fbe81cfd72ef50b91000f7f15052b90:*:504b03040a0001000800*...*:188420:181779:*:8ad6900f5088cab9356678e43c...3:*:3e3c623e6c696e6b3c2f6...

Paradigm shift in malware creation, yet no change in malware detection!

Page 6: Malware Detection


Focus On Behavior

[Chart: new malware & malware families over time, 2001–2006, logarithmic axis from 1 to 100,000. Source: Kaspersky Labs, Symantec]
New malware, 2001–2006: 8,821; 11,136; 20,731; 31,726; 53,950; 86,876
Malware families: 325; 335; 274; 202 (est.)

Family = malware with a common code base.

A family is a collection of behaviors.
A behavior can be shared by many families.

Number of families stays constant.
Number of variants grows exponentially.

Page 7: Malware Detection


Main thesis

Detection of obfuscated malware requires a semantic analysis of program behavior.

Program verification provides the techniques necessary to perform malware detection effectively and efficiently.

Page 8: Malware Detection


Specifying Behavior

Byte signatures allow for fast detection.
– But not resilient to obfuscation.

High-level descriptions require expensive detection.
– Resilient to obfuscation.

[Axis labels: Syntactic, Semantic]

“Execution of program M causes the system to reach a state where a copy of M has been sent by email.”
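To illustrate why a byte signature is fast but not resilient to obfuscation, here is a minimal sketch. The signature and the two byte strings are made up for this example (they use standard x86 encodings for `push 10h`, `push eax`, `push edi`, `call`, and `nop`); no real scanner or malware sample is implied.

```python
# Hypothetical byte signature: raw bytes of "push 10h; push eax; push edi".
SIGNATURE = bytes.fromhex("6a105057")


def signature_match(binary: bytes, signature: bytes = SIGNATURE) -> bool:
    """Syntactic detection: a hit means the signature bytes occur verbatim."""
    return signature in binary


# Original fragment: the three pushes followed by a call (E8 + 32-bit offset).
original = bytes.fromhex("6a105057e800000000")

# Same behavior, but with NOP bytes (0x90) interleaved between instructions.
obfuscated = bytes.fromhex("6a109050905790e800000000")

print(signature_match(original))
print(signature_match(obfuscated))
```

The substring test is a single cheap scan, but inserting junk bytes that do not change behavior is enough to defeat it, which is the trade-off the slide describes.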

Page 9: Malware Detection


Malspec: Self-Propagation by Email

[Diagram nodes: Connect, Send]

Netsky.B

    push 10h
    push eax
    push edi
    call connect
    push esi
    push eax
    push [ebp+hMem]
    call wsprintfA
    add esp, 0Ch
    push [ebp+hMem]
    call lstrlenA
    push 0
    push eax
    push [ebp+hMem]
    push ebx
    push eax
    push ecx
    push edi
    call send

Page 10: Malware Detection


Malspec: Self-Propagation by Email

Netsky.B: same code fragment as on the previous slide.

Malspec = syntactic component + semantic component

Connect: X := Arg1
Send: Arg1 = X & Arg2 = “EHLO.*”

Syntactic component describes temporal constraints.
Semantic component describes dependency constraints.
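The two components of this malspec can be sketched as a check over a sequence of observed API calls: the temporal part requires a connect before a send, and the dependency part requires the send to use the value bound to X and a payload matching “EHLO.*”. The `Call` record and the trace layout are assumptions made for this illustration, not a format taken from the slides.

```python
import re
from typing import NamedTuple


class Call(NamedTuple):
    """One observed API call (hypothetical trace record)."""
    name: str
    args: tuple


def matches_email_malspec(trace) -> bool:
    """True if some connect(X, ...) is later followed by send(X, "EHLO...")."""
    bound = set()  # values bound to X by earlier connect calls
    for call in trace:
        if call.name == "connect" and call.args:
            bound.add(call.args[0])                      # X := Arg1
        elif call.name == "send" and len(call.args) >= 2:
            sock, payload = call.args[0], call.args[1]
            # Arg1 = X  and  Arg2 matches "EHLO.*"
            if sock in bound and isinstance(payload, str) and re.match(r"EHLO.*", payload):
                return True
    return False


trace = [
    Call("connect", (3, "mail.example")),
    Call("wsprintfA", ("buf",)),          # unrelated call in between is allowed
    Call("send", (3, "EHLO example")),
]
print(matches_email_malspec(trace))
```

Unrelated calls between connect and send do not affect the result, while a send on a different socket, or one that precedes the connect, is rejected.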

Page 11: Malware Detection


Building a Real Malspec

“Send Email”:
  X:=socket()
  connect(X)
  send(X,“EHLO”)
  send(X,“DATA”)
  send(X,T)

“Read Own Exe. Image”:
  S:=process_name()
  Z:=open(S)
  Y:=read(Z)

Page 12: Malware Detection


Building a Real Malspec

“Send Email”:
  X:=socket()
  connect(X)
  send(X,“EHLO”)
  send(X,“DATA”)
  send(X,T)

“Read Own Exe. Image”:
  S:=process_name()
  Z:=open(S)
  Y:=read(Z)

Constraint linking the two behaviors: StringEqual(T, Base64(Y))
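The constraint relating the sent data T to the bytes Y read from the program's own image, which reads as StringEqual(T, Base64(Y)), can be checked on concrete values as in the sketch below. The function name and the sample bytes are assumptions for illustration only.

```python
import base64


def string_equal_base64(t: bytes, y: bytes) -> bool:
    """Data constraint: T is exactly the Base64 encoding of Y."""
    return t == base64.b64encode(y)


# Stand-in for Y := read(Z), the bytes of the program's own executable image.
image = b"MZ\x90\x00 toy executable image"

# What a self-mailing program would pass to send(X, T).
payload = base64.b64encode(image)

print(string_equal_base64(payload, image))
print(string_equal_base64(b"hello", image))
```

Real mail bodies usually wrap Base64 text across lines, so a practical check would likely normalise whitespace before comparing; that step is omitted here.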

Page 13: Malware Detection


Malspec Constraints

  X:=socket()
  connect(X)
  send(X,“EHLO”)
  send(X,“DATA”)
  S:=process_name()
  Z:=open(S)
  Y:=read(Z)
  send(X,T), with StringEqual(T, Base64(Y))

Constraint types labelled on the graph:
• Local constraint
• Dependence constraint: X after socket = X before connect

Automating Malspec Creation: Malspec Mining

[Figure labels: Malware Sample; Benign Program (four instances)]

Page 14: Malware Detection


Malspecs Benefits

[Malspec graph, as on Page 12: X:=socket(); connect(X); send(X,“EHLO”); send(X,“DATA”); S:=process_name(); Z:=open(S); Y:=read(Z); send(X,T) with StringEqual(T, Base64(Y))]

Choice of security-sensitive operations

Constraint-based execution order

Dependences free of obfuscation artifacts

Expressive to describe even obfuscated behavior.

Page 15: Malware Detection


Malspec Detection Strategies

• Static analysis
• Dynamic analysis
• Host-based IDS
• Inline Reference Monitors

[Malspec graph, as on Page 12: X:=socket(); connect(X); send(X,“EHLO”); send(X,“DATA”); S:=process_name(); Z:=open(S); Y:=read(Z); send(X,T) with StringEqual(T, Base64(Y))]

Malspecs are independent of detection method.

Page 16: Malware Detection


Detection of Malicious Behavior

[Diagram labels: Binary File, Malware Detector]

[Malspec graph, as on Page 12: X:=socket(); connect(X); send(X,“EHLO”); send(X,“DATA”); S:=process_name(); Z:=open(S); Y:=read(Z); send(X,T) with StringEqual(T, Base64(Y))]

Goal: Find a program path that matches the malspec.

Page 17: Malware Detection


Find A Malicious Program Path

[Malspec graph, as on Page 12: X:=socket(); connect(X); send(X,“EHLO”); send(X,“DATA”); S:=process_name(); Z:=open(S); Y:=read(Z); send(X,T) with StringEqual(T, Base64(Y))]

Interprocedural Control-Flow Graph
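Searching a control-flow graph for a path that performs the malspec operations in order can be sketched as a reachability search over (node, operations matched so far) states. This is a simplified, single-procedure illustration: the graph encoding, node labels, and function name are assumptions, and the data constraints of the malspec are ignored here.

```python
def has_matching_path(cfg, labels, entry, ops):
    """Return True if some path from `entry` performs `ops` in order.

    Other nodes may appear between matched operations.
    """
    if not ops:
        return True
    seen = set()
    stack = [(entry, 0)]  # (node, number of operations matched before it)
    while stack:
        node, matched = stack.pop()
        if labels.get(node) == ops[matched]:
            matched += 1
            if matched == len(ops):
                return True
        if (node, matched) in seen:  # state already explored; also cuts loops
            continue
        seen.add((node, matched))
        for succ in cfg.get(node, ()):
            stack.append((succ, matched))
    return False


# Toy graph: node 4 stands for junk code that can loop back to node 3.
cfg = {0: [1], 1: [2], 2: [3, 6], 3: [4], 4: [3, 5], 5: [6], 6: []}
labels = {1: "socket", 3: "connect", 5: "send"}

print(has_matching_path(cfg, labels, 0, ["socket", "connect", "send"]))
print(has_matching_path(cfg, labels, 0, ["connect", "socket"]))
```

Tracking the pair (node, matched) rather than the node alone lets the search revisit a node after more operations have been matched while still terminating on cyclic graphs.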

Page 18: Malware Detection


1) Match Malspec Operations

[Malspec graph, as on Page 12: X:=socket(); connect(X); send(X,“EHLO”); send(X,“DATA”); S:=process_name(); Z:=open(S); Y:=read(Z); send(X,T) with StringEqual(T, Base64(Y))]

Page 19: Malware Detection


2) Match Malspec Constraints

[Malspec graph, as on Page 12: X:=socket(); connect(X); send(X,“EHLO”); send(X,“DATA”); S:=process_name(); Z:=open(S); Y:=read(Z); send(X,T) with StringEqual(T, Base64(Y))]

Malspec Constraint: Z after open = Z before read

Program Constraint: The program fragment preserves the program expression bound to Z.

Like a semantic def-use constraint.

Page 20: Malware Detection


2) Match Malspec Constraints

Program Constraint: The program fragment preserves the program expression bound to Z.

Semantic nop wrt E = program fragment preserving an expression E.

Page 21: Malware Detection


2) Match Malspec Constraints

Program Constraint: The program fragment preserves the program expression bound to Z.

Need an Oracle...

Page 22: Malware Detection


Advances in Decision Procedures

Dramatic improvements in SAT solvers:
– SATO [Zhang, CADE 1997]
– GRASP [Marques-Silva & Sakallah, 1999]
– zChaff [Moskewicz et al., DAC 2001]
– BerkMin [Goldberg & Novikov, DATE 2002]

SAT-based Bounded Model Checking: [Clarke et al., FMSD 2001]
– SAT-specific speedups [Strichman, CHARME 2001]
– Richer logics [Seshia et al., DAC 2003]

A decision procedure can approximate an Oracle.

Page 23: Malware Detection


Using Decision Procedures

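Program Constraint: The program fragment preserves the program expression bound to Z.

[Diagram: a formula P is passed to a decision procedure, which answers True/False.]

Example program fragment:

    add esp, 0Ch
    push [ebp+hMem]

[Formula for this fragment was garbled in extraction; the surviving terms are esp, ebp, hMem, memory, and the constants 12 and 4.]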
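As a rough illustration of what it means for a fragment to preserve an expression, the sketch below models the slide's fragment (add esp, 0Ch followed by push [ebp+hMem]) on a toy 32-bit machine and tests candidate expressions by random execution. The state layout, the value chosen for hMem, and the function names are assumptions; random testing can only refute preservation, so it is a weak stand-in for a real decision procedure.

```python
import random

MASK = 0xFFFFFFFF
H_MEM = -8 & MASK  # hypothetical frame offset standing in for hMem


def run_fragment(state):
    """Toy model of: add esp, 0Ch ; push [ebp+hMem]."""
    regs = dict(state["regs"])
    mem = dict(state["mem"])
    regs["esp"] = (regs["esp"] + 0x0C) & MASK          # add esp, 0Ch
    value = mem.get((regs["ebp"] + H_MEM) & MASK, 0)   # read [ebp+hMem]
    regs["esp"] = (regs["esp"] - 4) & MASK             # push: move esp down by 4
    mem[regs["esp"]] = value                           # push: store the value
    return {"regs": regs, "mem": mem}


def maybe_preserves(expr, trials=200, seed=0):
    """False: a counterexample was found (definitive).
    True: no counterexample was seen (not a proof)."""
    rng = random.Random(seed)
    for _ in range(trials):
        before = {
            "regs": {"esp": rng.getrandbits(32), "ebp": rng.getrandbits(32)},
            "mem": {},
        }
        after = run_fragment(before)
        if expr(before) != expr(after):
            return False
    return True


print(maybe_preserves(lambda s: s["regs"]["ebp"]))  # ebp is never written
print(maybe_preserves(lambda s: s["regs"]["esp"]))  # esp changes by 12 - 4
```

In this model the fragment is a semantic nop with respect to ebp but not with respect to esp; a decision procedure would establish the first claim for all states rather than for a sample.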

Page 24: Malware Detection


Semantics-Aware Malware Detector

[Architecture diagram]
Binary File → Disassembler (IDA Pro) → CFG constructor → CFG → Semantics-Aware Detector → Yes / No

Semantics-Aware Detector, with a Malspec as input:
• Graph matching: malspec operations
• Constraint satisfaction: malspec constraints, using Simplify [Detlefs et al., “Simplify,” 2004] and UCLID [Lahiri & Seshia, CAV 2004]

Page 25: Malware Detection


Effective Detection

With hard-coded semantic-nop patterns:

                         Commercial AV    SAFE
  Known malware          100%             100%
  Obfuscated variants    0%               100%

With decision procedures:

  Malspec source    Variants detected    # of AV signatures    # of SAMD malspecs
  Netsky.B          C,D,O,P,T,W          7                     1
  Bagle.I           J,N,O,P,R,Y          7                     1

Page 26: Malware Detection


Semantic-Nop Detection Benefits

Semantic-Nop features:
• Flow sensitivity
• Binding procedure
• Decision procedures
• Rich constraints

Obfuscation resilience:
• Code reordering
• Register renaming
• Junk code
• Code substitution

Page 27: Malware Detection


Detection Performance

Powerful decision procedures are expensive.

• Simplify theorem prover, UCLID bounded model checker: 300–800 s
• SAFE pattern matching: 1–9 s

Idea: Use expensive decision procedures only if cheap decision procedures do not provide a definitive answer.

Page 28: Malware Detection


Stack of Decision Procedures

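[Diagram: a Constraint and a Program fragment are submitted to a stack of decision procedures: Random execution, SAFE pattern matching, Simplify theorem prover, UCLID bounded model checker. Labels on the diagram: “Yes”, “No”, “Yes”, “Yes/No”, “No, code does not satisfy constraint!”, and three “?” marks.]

Average cost, same decision power.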
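The idea of stacking decision procedures, trying cheap ones first and falling back to expensive ones only when no definitive answer is available, can be sketched as follows. The two procedures are toy stand-ins invented for this example; they do not model SAFE, Simplify, or UCLID.

```python
def decide(constraint, fragment, procedures):
    """Try procedures in order; each returns True, False, or None (don't know)."""
    for name, proc in procedures:
        verdict = proc(constraint, fragment)
        if verdict is not None:
            return verdict, name
    return None, None


def pattern_matcher(constraint, fragment):
    """Cheap stand-in: recognises one hard-coded semantic nop, else gives up."""
    return True if fragment == "nop" else None


prover_calls = []  # records when the expensive stand-in is actually used


def slow_prover(constraint, fragment):
    """Expensive stand-in: always answers, here via a toy textual rule."""
    prover_calls.append(fragment)
    if constraint == "preserve esp":
        return "esp" not in fragment
    return True


STACK = [("pattern matching", pattern_matcher), ("theorem prover", slow_prover)]

print(decide("preserve esp", "nop", STACK))
print(decide("preserve esp", "add esp, 0Ch", STACK))
```

Because a later procedure runs only when every earlier one returns None, the answers match those of the most powerful procedure while easy queries avoid its cost.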

Page 29: Malware Detection


Performance Results

Detection times in seconds

  Malware                        Minimum    Average    Maximum
  Netsky (B,C,D,O,P,T,W)         60.56      99.57      140.08
  Bagle (I,J,N,O,P,R,Y)          36.00      56.41      97.13
  Bagle (obfuscated variants)    74.81      140.14     186.50

Test setup: 1 GHz CPU, 1 GB RAM

Comparison:
• Commercial signature-based detector: <1 s
• Decision procedure-based detector: >300 s

