Jack of all Formats

DESCRIPTION

In this presentation, I discuss four different approaches to merging multiple files of different formats into one file that can be read as each type. I then discuss the security implications of this property, which is inherent in many file formats, and theorize about attacks that can be launched when developers assume that a file can have only one format.

Jack of all Formats

Daniel “unicornFurnace” Crowley
Application Security Services, Trustwave - SpiderLabs

Introductions

How can files be multiple formats?

Why is this interesting from a security perspective?

What can we do about it?

(yo dawg we heard you like files so we put files in your files)

Copyright Trustwave 2010 Confidential

Terms

File piggybacking
• Placing one file into another

File consumption
• Parsing a file and interpreting its contents

Scope of this talk

Files which can be interpreted as multiple formats
• …with at most a change of file extension

Covert channels
• Through use of piggybacking

Examples are mostly Web-centric
• Only because it’s my specialty
• This concept applies to more than Web applications
− Srsly this applies to more than Web applications
• GUYS IT’S NOT JUST WEB APPS

Files with multiple formats

How to piggyback files

(Clap and cheer now to confuse the people who can’t read this)

File format flexibility

Not always rigidly defined
• From the PDF specification:
“This standard does not specify the following:……methods for validating the conformance of PDF files or readers…”
− Thank you Julia Wolf for “OMG WTF PDF”
• CSV comments exist but are not part of the standard

Not all data in a file is parsed
• Metadata
• Unreferenced blocks of data
• Data outside start/end markers
• Reserved, unused fields

File format flexibility

Some data can be interpreted multiple ways

Method of file consumption often determined by:
• File extension
− Multiple file extensions may result in multiple parses
• Bytes at beginning of file
• First identified file header
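A minimal sketch of how the three selection methods above can disagree about the same file. The signature table and the function names are assumptions made for illustration; real consumers use much larger tables and stricter checks.

```python
import os

# Start markers from this talk, plus the well-known PNG and 7z signatures.
SIGNATURES = {
    "jpeg": b"\xff\xd8",
    "png": b"\x89PNG\r\n\x1a\n",
    "pdf": b"%PDF",
    "rar": b"Rar!\x1a\x07\x00",
    "7z": b"7z\xbc\xaf\x27\x1c",
}


def by_extension(filename):
    """Trust the last file extension and nothing else."""
    ext = os.path.splitext(filename)[1].lower().lstrip(".")
    return ext or None


def by_leading_bytes(data):
    """Accept a format only if its signature sits at offset 0."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def by_first_header(data):
    """Scan the whole file and report the earliest signature found.

    Short signatures (such as the 2-byte JPEG marker) can match by chance
    in arbitrary data, so a real scanner would validate further.
    """
    hits = [(data.find(magic), name) for name, magic in SIGNATURES.items()]
    hits = [(pos, name) for pos, name in hits if pos >= 0]
    return min(hits)[1] if hits else None
```

For a 7z archive with junk in front and a `.pdf` extension, `by_extension` answers `pdf`, `by_leading_bytes` answers `None`, and `by_first_header` answers `7z`.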

7zip file with junk data at the beginning

Multiple file extensions

Apache has:
• Languages
• Handlers
• MIME types

File.en.php.png
• “File”: basename, largely ignored
• “.en”: language, US English
• “.php”: triggers PHP handler
• “.png”: triggers image/png MIME type
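A simplified model of the behavior described above, in which every extension in the filename is looked up independently. This is a sketch and not Apache's actual code; the lookup tables and the handler name `php-handler` are hypothetical.

```python
# Hypothetical lookup tables standing in for server configuration.
LANGUAGES = {"en": "en", "fr": "fr"}
HANDLERS = {"php": "php-handler"}
MIME_TYPES = {"png": "image/png", "jpg": "image/jpeg", "html": "text/html"}


def classify(filename):
    """Assign language, handler and MIME type from *every* extension."""
    basename, *extensions = filename.split(".")
    result = {"basename": basename, "language": None,
              "handler": None, "mime": None}
    for ext in (e.lower() for e in extensions):
        if ext in LANGUAGES:
            result["language"] = LANGUAGES[ext]
        if ext in HANDLERS:
            result["handler"] = HANDLERS[ext]
        if ext in MIME_TYPES:
            result["mime"] = MIME_TYPES[ext]
    return result
```

Under this model `classify("File.en.php.png")` ends up with a PHP handler even though the final extension is `.png`, which is why checking only the last extension is not enough.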

Metadata

Information about the file itself

Not always parsed by the file consumer

• “Comment” fields, few restrictions on data

Files can be inserted into comment fields for one format

• ID3 tags for mp3 files will be shown in players

− But not usually interpreted

Metadata – GIF comment
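A rough sketch of how arbitrary bytes can be carried in a GIF89a comment extension: the block starts with the bytes 0x21 0xFE, holds the payload in sub-blocks of at most 255 bytes, and ends with a zero-length sub-block. The function names and the sample 1x1 image bytes are assumptions for illustration, and inserting the block just before the trailer byte is only one possible placement.

```python
# A commonly used minimal 1x1 GIF, assumed here as sample input.
TINY_GIF = (b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
            b",\x00\x00\x00\x00\x01\x00\x01\x00\x00\x02\x02D\x01\x00;")


def comment_extension(payload: bytes) -> bytes:
    """Wrap payload in a GIF comment extension block."""
    out = bytearray(b"\x21\xfe")          # extension introducer + comment label
    for i in range(0, len(payload), 255):
        piece = payload[i:i + 255]
        out.append(len(piece))            # sub-block length (1..255)
        out += piece
    out.append(0)                         # block terminator
    return bytes(out)


def add_comment(gif: bytes, payload: bytes) -> bytes:
    """Insert a comment extension just before the GIF trailer (0x3B)."""
    if not gif.startswith(b"GIF89a") or not gif.endswith(b"\x3b"):
        raise ValueError("expected a GIF89a file ending in a trailer byte")
    return gif[:-1] + comment_extension(payload) + b"\x3b"
```

An image viewer will typically skip the comment, while a consumer that scans for another format's header may find the embedded file.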

Unreferenced blocks of data

Certain formats define resources with offsets and sizes

• Unmentioned parts of the file are ignored

• Other files can occupy unmentioned space

Other formats indicate a total size of data to be parsed

• Any additional data is ignored

• Other files can simply be appended

Some formats indicate that unrecognized data is ignored

• May still need to be formatted correctly

Unreferenced PDF object

The PDF xref table lists object offsets in the file

We first remove one reference

Next, we replace part of that object’s content with a 7zip file.

PDF / 7Z opened as a PDF

PDF / 7Z opened as a 7Z

PNG file format

• Static signature

• Series of chunks
− IHDR chunk
− Other chunks including at least one IDAT chunk
− IEND chunk

PNG chunk format

• 4 byte length field

• 4 byte identification field

• Data

• 4 byte CRC of id field and data field

Chunks with unknown IDs will be ignored (if the ID marks them as ancillary, i.e. starts with a lowercase letter)

The CRC will likely not even be checked

jaCK chunk
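A minimal sketch of the chunk layout from the previous slide, used to place a private `jaCK` chunk in front of IEND. The helper names and the tiny 1x1 grayscale test image are assumptions for illustration; the CRC is computed properly here even though many decoders would not check it.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """4-byte length, 4-byte ID, data, 4-byte CRC over ID and data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return (struct.pack(">I", len(data)) + chunk_type + data
            + struct.pack(">I", crc))


def iter_chunks(png: bytes):
    """Yield (chunk_type, data) pairs up to and including IEND."""
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        yield chunk_type, png[pos + 8:pos + 8 + length]
        if chunk_type == b"IEND":
            return
        pos += 12 + length


def add_private_chunk(png: bytes, payload: bytes, chunk_type=b"jaCK") -> bytes:
    """Rebuild the PNG with an extra ancillary chunk before IEND."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG")
    out = bytearray(PNG_SIGNATURE)
    for ctype, data in iter_chunks(png):
        if ctype == b"IEND":
            out += make_chunk(chunk_type, payload)
        out += make_chunk(ctype, data)
    return bytes(out)


def tiny_png() -> bytes:
    """A 1x1 8-bit grayscale PNG built from the helpers above."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")     # filter byte + one black pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))
```

The lowercase first letter of `jaCK` marks the chunk as ancillary, so a decoder that does not recognize it is expected to skip it.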

Start/End markers

Many formats use a magic byte sequence to denote the beginning of data

Similarly, many have one to denote the end of data

Data outside start/end markers is ignored
• Files can be placed before or after such markers
− Files must not contain conflicting markers

Start/End markers

JPEG
• Start marker: 0xFFD8
• End marker: 0xFFD9

RAR
• Start marker: 0x526172211A0700

PDF
• Start marker: %PDF
• End marker: \n%%EOF\n (\r and \r\n can replace \n)

PHP
• Start marker: <?php
• End marker: ?>
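A small model of a lenient, marker-driven consumer, using the JPEG and RAR markers listed above. The function name and the placeholder file contents are assumptions for illustration; the placeholders are not decodable files, and real parsers do more than search for two byte strings.

```python
JPEG_START, JPEG_END = b"\xff\xd8", b"\xff\xd9"
RAR_START = b"Rar!\x1a\x07\x00"          # 0x526172211A0700


def between_markers(data, start, end=None):
    """Return the span a marker-driven consumer would look at, or None.

    Everything before the start marker (and after the end marker, if the
    format has one) is simply never examined.
    """
    begin = data.find(start)
    if begin < 0:
        return None
    if end is None:
        return data[begin:]
    stop = data.find(end, begin + len(start))
    if stop < 0:
        return None
    return data[begin:stop + len(end)]


# Placeholder contents standing in for a real image and a real archive.
fake_jpeg = JPEG_START + b"<image segments>" + JPEG_END
fake_rar = RAR_START + b"<archive blocks>"
combined = fake_jpeg + fake_rar
```

Here the JPEG view of `combined` is exactly `fake_jpeg` and the RAR view is exactly `fake_rar`. If the archive were placed first and happened to contain 0xFFD8, the JPEG view would start in the wrong place, which is the conflicting-marker problem.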

A WinRAR is you!

A WinRAR is also JPEG!

Limitations

Some formats use absolute offsets
• They must be placed at start of file or offsets must be adjusted
• Examples: JPEG, BMP, PDF

Some have headers which indicate the size of each resource to follow
• Such files are usually easy to work with
• Other files can be appended without breaking things
• Examples: RAR

Limitations

Some files are simply parsed from start to end
• Such files require some metadata, unreferenced space, or data which can be manipulated to have multiple meanings

Different parsers for the same format operate differently
• Might implement different non-standard features
• May interpret format of files in different ways

TrueCrypt volumes

No start/end markers
No publicly known signature
• Parsed from start of file to end of file
No metadata fields
No unused space
Data is difficult to manipulate

Security Implications

Reasons why file piggybacking must be considered

(Read the first word in every sub-bullet on the next slide)

Security Implications

Data infiltration/exfiltration
• Never check what .mp3 files pass in and out of your network?
− Gonna change that when you get back to the office?

Anti-Virus evasion
• Give an AV a piggybacked file, it might apply the wrong rules
− You might not know that most AV applies heuristics/signatures based on identified file format!

File upload pwnage
• Up loading well-formed images that are also backdoors is possible

Security Implications

Multiple file consumers
• Different programs may interpret the file in different ways
− GIFAR issue

Parasitic storage
• How many file uploads allow only valid images?

Disk space exhaustion DoS
• Some image uploads limit uploads by picture dimensions
• Size of the file may not actually be checked

File upload pwnage

Imagine a Web-based image upload utility

• It confirms that the uploaded file is a valid JPEG

• It doesn’t check the file extension

• It uploads the file into the Web root

• It doesn’t set the permissions to disallow execution

Code upload is possible if the file is also a valid JPEG

• This isn’t hard…
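A minimal sketch of why a content check alone does not help in the scenario above. The naive marker check is only a stand-in for real JPEG validation (a full decoder would equally accept a genuine image that carries such a comment), and all function names are assumptions for illustration.

```python
import struct

JPEG_START, JPEG_END = b"\xff\xd8", b"\xff\xd9"


def jpeg_comment_segment(text: bytes) -> bytes:
    """COM segment: marker 0xFFFE, then a 2-byte length that counts itself."""
    return b"\xff\xfe" + struct.pack(">H", len(text) + 2) + text


def passes_naive_jpeg_check(data: bytes) -> bool:
    """Stand-in for 'the upload is a valid JPEG'."""
    return data.startswith(JPEG_START) and data.endswith(JPEG_END)


def contains_php_open_tag(data: bytes) -> bool:
    """Flag content that a PHP handler would start executing."""
    return b"<?php" in data.lower()


# A harmless placeholder: image markers around a comment holding PHP source.
suspicious = (JPEG_START
              + jpeg_comment_segment(b"<?php echo 'hello'; ?>")
              + JPEG_END)
```

`suspicious` passes the image check and still contains a PHP block, so the remaining conditions on the slide (extension, upload location, permissions) are what decide whether it can run.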

Anti-Virus evasion exercise

Check detection rates on Win32 netcat

Place it in an archive and check

Put junk data at the beginning of the file and check

Piggyback the archive onto the end of a JPEG and check

Change the file extension to .JPG and check

Check detection rates on netcat

Archive netcat and check again

Add junk at the beginning of the file

Piggyback the archive onto a JPEG

Change the extension to .jpg

LULZ netkitties

Data Infiltration

Take the previous example of a 7z attached to a JPEG
• This will bypass lots of AV
• Maybe also IDS/IPS
− Haven’t tested it

Data Exfiltration

• DLP will generally look for:
− Type of files being communicated
− Content of traffic
− Communication properties
• These techniques allow for covert channels
− With wide bandwidth
− With some plausible deniability
− In files which are ordinarily harmless and frequently passed
− Without breaking the piggybacked files’ usability

Parasitic storage

• Certain sites allow for file upload of specific formats
• File piggybacking essentially removes this limitation
• This technique has been used on 4chan (now fixed)
− Book sharing threads
− LOIC distribution
− CP distribution
• Still works on certain image sites
• Browsers automagically download images
− What if those images are also malware?
− Now all you need to do is figure out how to execute it…

Multiple File Consumers

• GIFAR issue
− JAR appended to the end of a GIF
− Browser loads the GIF
− Old versions of JVM would recognize AND RUN the JAR
• Apache handling “file.en.php.png”
− Passes file to PHP for preprocessing
− Serves resulting output with a US English language tag and a MIME type of “image/png”

Disk Space Exhaustion DoS

• Imagine a file upload utility
− It allows the upload of only 1x1 images
− For disk space reasons
• Append 2GB of junk onto the end of a 1x1 image
• ???
• NO DISK SPACE!!!
• Checking properties of the file format may not be sufficient

Protections

What can we do about this?

(Not much)

File upload with code

• Don’t upload in the Web root

• Don’t allow the user to control any part of the filename

• Don’t set the perms to executable

• Don’t trust file properties

• Allow only one extension

• Allow only known good extensions
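One possible way to apply several of the points above when choosing where to store an upload. This is a sketch under stated assumptions: `UPLOAD_DIR`, the allowlist and the function names are hypothetical, and the directory is assumed to live outside the Web root.

```python
import os
import secrets

UPLOAD_DIR = "/srv/uploads"              # hypothetical, outside the Web root
ALLOWED_EXTENSIONS = {"jpg", "png", "gif"}


def storage_path(client_filename: str) -> str:
    """Pick a server-generated name; keep only an allowlisted extension."""
    parts = client_filename.lower().split(".")
    # Exactly one extension, and it must be on the allowlist.
    if len(parts) != 2 or not parts[0] or parts[1] not in ALLOWED_EXTENSIONS:
        raise ValueError("rejected filename")
    # The user-supplied basename is discarded entirely.
    return os.path.join(UPLOAD_DIR, secrets.token_hex(16) + "." + parts[1])


def store(data: bytes, client_filename: str) -> str:
    """Write the upload without execute permission; fail if the name exists."""
    path = storage_path(client_filename)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path
```

With this scheme a name such as `File.en.php.png` is rejected for having more than one extension, and an accepted upload never keeps any user-chosen part of its name beyond the choice among allowlisted extensions.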

Anti-virus Evasion

• We could:

• Check for all valid file headers

• Performance hit

• Apply all signatures/heuristics globally

• Big freakin’ performance hit

• Identify by behavior

• This doesn’t work on gateway AV

Disk Space Exhaustion

• Don’t just check properties from the expected format

• Nuff said

• Put some additional protection in place

• Disk quota

• Separate partition for uploads
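A minimal sketch of checking the byte count alongside a format property, using PNG dimensions as the example. The limits and function names are hypothetical; a real service would also enforce the size limit while reading the request rather than after buffering it.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 64 * 1024                    # hypothetical per-file limit


def png_dimensions(data: bytes):
    """Read width and height from the IHDR chunk at the start of a PNG."""
    if not data.startswith(PNG_SIGNATURE) or data[12:16] != b"IHDR":
        raise ValueError("not a PNG with a leading IHDR chunk")
    return struct.unpack(">II", data[16:24])


def accept_upload(data: bytes, max_dim=(1, 1), max_bytes=MAX_BYTES) -> bool:
    """Require both the declared dimensions and the real size to be small."""
    width, height = png_dimensions(data)
    return (width <= max_dim[0] and height <= max_dim[1]
            and len(data) <= max_bytes)
```

The dimension check alone would accept a 1x1 image followed by gigabytes of appended junk; the `len(data)` comparison is what rejects it.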

Parasitic storage

• In metadata

• Remove metadata

• At end of file

• Parse out relevant format data and save as new file

• In unreferenced block or as part of real data

• Don’t upload files?

• Don’t allow unauthenticated file upload?
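A rough sketch of the "parse out relevant format data and save as new file" idea for PNG: copy only an allowlist of chunks up to IEND, which drops metadata chunks, unknown chunks and anything appended after the image. The allowlist and function name are assumptions for illustration. This does not address data hidden inside the pixel data itself; fully decoding and re-encoding the image would be a stronger measure and is not shown here.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
KEEP = {b"IHDR", b"PLTE", b"tRNS", b"IDAT", b"IEND"}   # assumed allowlist


def sanitize_png(data: bytes) -> bytes:
    """Return a new PNG containing only allowlisted chunks up to IEND."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG")
    out = bytearray(PNG_SIGNATURE)
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        body = data[pos + 8:pos + 8 + length]
        crc = data[pos + 8 + length:pos + 12 + length]
        if len(body) != length or len(crc) != 4:
            raise ValueError("truncated chunk")
        if struct.unpack(">I", crc)[0] != zlib.crc32(chunk_type + body):
            raise ValueError("bad CRC")
        if chunk_type in KEEP:
            out += data[pos:pos + 12 + length]
        pos += 12 + length
        if chunk_type == b"IEND":
            return bytes(out)             # anything after IEND is dropped
    raise ValueError("missing IEND")
```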

Questions?

Daniel Crowley
Dcrowley@Trustwave.com

@dan_crowley
