Mathematical model of computer viruses

Ferenc Leitold, Hunix Ltd., Hungary

fleitold@hunix.hu

Table of contents

• Models of computation
• Operating system
• Virus definition
• What can we do with this mathematical model?

Turing Machine

$T = \langle Q, S, I, \delta, b, q_0, q_f \rangle$

• $Q$: set of states
• $q_0$: initial state
• $q_f$: final state
• $S$: tape symbols
• $I$: input symbols, $I \subset S$
• $b$: blank symbol, $b \in S \setminus I$
• $\delta$: move function, $\delta: Q \times S \to Q \times S \times \{l, r, s\}$
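The definition above can be exercised with a small simulator. This is a minimal sketch, not taken from the slides: the function name `run_tm`, the sparse-dictionary tape, the step limit, and the example machine `FLIP` are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# delta maps (state, symbol) -> (new state, symbol to write, move),
# where move is one of "l" (left), "r" (right), "s" (stay).
Delta = Dict[Tuple[str, str], Tuple[str, str, str]]


def run_tm(delta: Delta, tape_input: str, q0: str, qf: str,
           blank: str = "b", max_steps: int = 10_000) -> str:
    """Run the machine from state q0 until it reaches qf; return the tape."""
    tape = dict(enumerate(tape_input))  # sparse tape: position -> symbol
    state, head, steps = q0, 0, 0
    while state != qf:
        if steps >= max_steps:
            raise RuntimeError("step limit reached (machine may not halt)")
        symbol = tape.get(head, blank)
        if (state, symbol) not in delta:
            raise RuntimeError(f"no move defined for {(state, symbol)}")
        state, tape[head], move = delta[(state, symbol)]
        head += {"l": -1, "r": 1, "s": 0}[move]
        steps += 1
    if not tape:
        return ""
    cells = range(min(tape), max(tape) + 1)
    return "".join(tape.get(i, blank) for i in cells).strip(blank)


# Hypothetical example machine: flip every bit, stop at the first blank.
FLIP: Delta = {
    ("q0", "0"): ("q0", "1", "r"),
    ("q0", "1"): ("q0", "0", "r"),
    ("q0", "b"): ("qf", "b", "s"),
}

print(run_tm(FLIP, "0110", "q0", "qf"))  # prints 1001
```

The step limit is only a practical guard: a simulator cannot know in general whether the machine will ever reach $q_f$.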

Random Access Machine

[Figure: memory cells $m_0, m_1, m_2, m_3, m_4, \ldots$ and an accumulator]

RASPM

RASPM with ABS

RASPM with SABS

RASPM with ABS: definition

$G = \langle V, U, T, f, q, M \rangle$

• $V$: set of symbols
• $U$: operation codes, $U \subseteq V$
• $T$: set of processor's activities
• $f: U \to T$
• $q$: initial value of the IP
• $M$: initial memory content

Instruction set

• move (LOAD, STORE)
• logical (AND, OR, XOR)
• arithmetic (ADD, SUB, MULT, DIV)
• branch (JUMP, JGTZ, JZERO)
• input/output tape handling (READ, WRITE)
• background tape handling (GET, PUT, SEEK, SETDRIVE)
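To make the roles of $f$, $q$ and $M$ concrete, here is a minimal interpreter sketch for a subset of the instruction set. Several things are assumptions and do not come from the slides: the numeric operation codes, the two-cell instruction format (operation code followed by operand), direct addressing, and the `HALT` instruction. The background tape instructions are left out.

```python
from typing import Callable, Dict, List


class Machine:
    def __init__(self, memory: List[int], ip: int, input_tape: List[int]):
        self.m = list(memory)        # M: initial memory content (program + data)
        self.ip = ip                 # q: initial value of the IP
        self.acc = 0                 # accumulator
        self.inp = list(input_tape)  # input tape
        self.out: List[int] = []     # output tape
        self.halted = False


# T: processor activities; each takes the machine and an operand (an address).
def load(mc, a): mc.acc = mc.m[a]
def store(mc, a): mc.m[a] = mc.acc
def add(mc, a): mc.acc += mc.m[a]
def sub(mc, a): mc.acc -= mc.m[a]
def jump(mc, a): mc.ip = a
def jgtz(mc, a): mc.ip = a if mc.acc > 0 else mc.ip
def jzero(mc, a): mc.ip = a if mc.acc == 0 else mc.ip
def read(mc, a): mc.m[a] = mc.inp.pop(0)
def write(mc, a): mc.out.append(mc.m[a])
def halt(mc, a): mc.halted = True   # assumed; not in the slide's list


# f: U -> T, with hypothetical numeric operation codes.
F: Dict[int, Callable[[Machine, int], None]] = {
    0: halt, 1: load, 2: store, 3: add, 4: sub,
    5: jump, 6: jgtz, 7: jzero, 8: read, 9: write,
}


def run(mc: Machine, f=F, max_steps: int = 10_000) -> List[int]:
    """Fetch (opcode, operand) pairs from memory and execute them."""
    steps = 0
    while not mc.halted:
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        opcode, operand = mc.m[mc.ip], mc.m[mc.ip + 1]
        mc.ip += 2
        f[opcode](mc, operand)
        steps += 1
    return mc.out


# READ 20; READ 21; LOAD 20; ADD 21; STORE 20; WRITE 20; HALT
PROGRAM = [8, 20, 8, 21, 1, 20, 3, 21, 2, 20, 9, 20, 0, 0] + [0] * 8

print(run(Machine(PROGRAM, 0, [2, 3])))  # prints [5]
```

Because the program lives in the same memory as the data, a `STORE` can overwrite instructions, which is the property the stored-program model needs in order to describe code that modifies or attaches to other code.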

Operating System

• a system of programs
• able to handle separate program or data files
• able to run a specified program

Operating Systems under RASPM with ABS

• The OS is in the initial memory (M) → OS-specific machine
• The OS is on the background tape → OS-independent machine
• The OS is on the input tape → unusable

Comparing RASPMs with ABS

$G_1 = \langle V_1, U_1, T_1, f_1, q_1, M_1 \rangle$
$G_2 = \langle V_2, U_2, T_2, f_2, q_2, M_2 \rangle$

$\{q_1, M_1\} \neq \{q_2, M_2\}$
• different operating systems
• different loader program

$\{f_1, T_1, U_1\} \neq \{f_2, T_2, U_2\}$
• different instruction sets (activities)
• different sets of operation codes
• different operation codes

$V_1 \neq V_2$
• different symbols
• different tape formats

Computer virus

• a (part of a) program
• it is attached to a program area
• it is able to link itself to other program areas
• it is executed when the host program area is to be executed

Virus spreading modes

• machine specific
• machine independent
• operating system specific
• operating system independent
• direct
• indirect

What can we do with this mathematical model?

• Examining the virus detection problem
• Examining searching techniques
• Examining polymorphic viruses
• Examining multiplatform viruses

General virus detection problem

Theorem: It is impossible to build a Turing Machine which could decide whether an executable file in a RASPM with ABS contains a virus or not.

Proof:

[Figure: layout of the executable: Host program | Virus | TM prg | TM input]

Virus detection problem → TM halting problem: a machine that decided the former would also decide the latter.
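The reduction can be written out in a few lines. This is a sketch under an assumption the slides do not state explicitly: that the virus part of the constructed executable spreads only after the embedded TM program, run on the embedded TM input, halts. The names $D$, $E$ and $H$ are introduced here for illustration only.

```latex
% Assume, for contradiction, a Turing Machine D that decides virus presence:
%   D(x) = 1 if the executable x contains a virus, 0 otherwise.
% E(P, w) is the executable laid out as
%   Host program | Virus | TM prg = P | TM input = w,
% where (by assumption) the virus part spreads only if P halts on w.
\begin{align*}
E(P, w) \text{ contains a virus} &\iff P \text{ halts on } w, \\
H(P, w) &:= D\bigl(E(P, w)\bigr).
\end{align*}
% H would then decide the TM halting problem, which is undecidable,
% so D cannot exist.
```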

“An anti-virus has its limit, thanks to Turing, and a virus can find those limits, exploit them, thanks to Darwin.”

from the Giant Black Book of Computer Viruses

Searching technique questions

• For what kind of viruses can it be used?
• What is the probability of false alarms?
• What are the expense criteria?

Sequence searching algorithm

• for non-polymorphic known viruses
• false alarms: $p \approx \dfrac{L \cdot M}{n^N}$
• expense criteria: P, polynomial: $\leq L \cdot M \cdot N$ comparisons

$L$: size of suspicious area
$M$: number of sequences
$N$: size of a sequence
$n$: number of values in one cell
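Both quantities on this slide can be checked with a short sketch. The function names `false_alarm_estimate` and `scan` are hypothetical, and the scanner is a plain naive search written to show where the $L \cdot M \cdot N$ bound comes from. It is not a claim about how any particular scanner is implemented.

```python
from typing import List, Sequence, Tuple


def false_alarm_estimate(L: int, M: int, N: int, n: int) -> float:
    """p ~ L * M / n**N, using the slide's symbols."""
    return L * M / n ** N


def scan(area: Sequence[int],
         sequences: List[Sequence[int]]) -> Tuple[List[Tuple[int, int]], int]:
    """Naive search for every sequence at every position of the area.

    Returns (hits, comparisons); each hit is (position, sequence index).
    At most len(area) * len(sequences) * max sequence length cell
    comparisons are made, which is the L * M * N bound.
    """
    hits: List[Tuple[int, int]] = []
    comparisons = 0
    for start in range(len(area)):
        for index, seq in enumerate(sequences):
            if start + len(seq) > len(area):
                continue  # the sequence would run past the end of the area
            matched = True
            for offset, value in enumerate(seq):
                comparisons += 1
                if area[start + offset] != value:
                    matched = False
                    break
            if matched:
                hits.append((start, index))
    return hits, comparisons


hits, comparisons = scan([1, 2, 3, 4, 2, 3], [[2, 3], [9, 9]])
print(hits)                     # prints [(1, 0), (4, 0)]
print(comparisons <= 6 * 2 * 2)  # prints True
```

The estimate behaves like a probability only while it is much smaller than 1. With byte-sized cells ($n = 256$) and sequences of 16 cells, $n^N = 2^{128}$, so even large values of $L$ and $M$ give a very small $p$.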

“Heuristic” algorithm

• for known viruses
• expense criteria: NP

[Figure: Host program | Decoder (cycle) | Body]

Executes $2^n$ cycles!

How can we measure the power of polymorphism?

[Figure: Host program | Decoder | Body]

$\alpha = \dfrac{\text{size of variable parts of the virus}}{\text{full size of the virus}}$

$\beta = \text{number of variants of the decoders}$
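As an illustration of the two measures, the sketch below estimates them from a set of collected samples. It rests on assumptions that the slides do not make: that all samples have the same length and are aligned cell by cell, and that a position counts as "variable" when it differs between at least two samples. The function names are hypothetical.

```python
from typing import Sequence


def alpha_from_variants(variants: Sequence[bytes]) -> float:
    """Share of positions that differ between aligned, equal-length samples."""
    if not variants:
        raise ValueError("at least one sample is required")
    length = len(variants[0])
    if length == 0 or any(len(v) != length for v in variants):
        raise ValueError("samples must be non-empty and of equal length")
    # zip(*variants) yields one column per position across all samples.
    variable = sum(1 for column in zip(*variants) if len(set(column)) > 1)
    return variable / length


def beta_from_decoders(decoders: Sequence[bytes]) -> int:
    """Number of distinct decoder variants among the samples."""
    return len(set(decoders))


samples = [b"\x01\x02\x03\x04", b"\x01\x09\x03\x07", b"\x01\x02\x03\x08"]
print(alpha_from_variants(samples))             # prints 0.5
print(beta_from_decoders([b"A", b"B", b"A"]))   # prints 2
```

Values estimated this way are lower bounds: a finite set of samples can only show variation that happened to occur in it.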

Flowchart of a virus

• search for an uninfected program
• append virus
• choose a random instruction in the virus
• swap with the next instruction
• repeat 100 times

Name: RIPPER
Aliases: Jack Ripper
Status: Common
Origin: Norway
Length: 1024 bytes (2 sectors)
Infect: MBR, Boot sector
Other: Resident, Stealth, Disk corruption

The virus swaps two words in the DOS write buffer. This happens at random, in approximately 1 of every 1024 writes.

Multiplatform viruses

$G_1 = \langle V_1, U_1, T_1, f_1, q_1, M_1 \rangle$
$G_2 = \langle V_2, U_2, T_2, f_2, q_2, M_2 \rangle$

Conditions:

• $V_1 \cap U_2 \neq \emptyset$: G1 has to know some operation codes of G2
• $U_1 \cap V_2 \neq \emptyset$: G2 has to know some operation codes of G1
• $U_1 \cap U_2 \neq \emptyset$: the virus code can be the same
• $U_1 \cap U_2 = \emptyset$: the virus code must be different
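The conditions are plain set relations, so they can be checked mechanically. The sketch below models only the $V$ and $U$ components of each machine; the `Platform` type, the function names and the example values are assumptions made for illustration.

```python
from typing import FrozenSet, NamedTuple


class Platform(NamedTuple):
    V: FrozenSet[int]  # set of symbols
    U: FrozenSet[int]  # operation codes, U is a subset of V


def multiplatform_possible(g1: Platform, g2: Platform) -> bool:
    """Each machine's symbols must include some operation codes of the other."""
    return bool(g1.V & g2.U) and bool(g1.U & g2.V)


def same_code_possible(g1: Platform, g2: Platform) -> bool:
    """Shared operation codes are needed for identical virus code."""
    return bool(g1.U & g2.U)


cells = frozenset(range(256))                # hypothetical byte-sized symbols
g1 = Platform(V=cells, U=frozenset({1, 2, 3}))
g2 = Platform(V=cells, U=frozenset({3, 4}))
g3 = Platform(V=cells, U=frozenset({10, 11}))

print(multiplatform_possible(g1, g2), same_code_possible(g1, g2))  # prints True True
print(multiplatform_possible(g1, g3), same_code_possible(g1, g3))  # prints True False
```

In the second case the two machines share no operation codes, which corresponds to the last condition: the virus code must be different on each platform.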
