Intel XMLThreat Whitepaper

  • 8/3/2019 Intel XMLThreat Whitepaper

    Abstract

    This document describes a comprehensive threat model for a new breed of threats

    based on XML content, including XML languages used in the Service Oriented

    Architecture (SOA) paradigm such as SOAP and the Web Services Description Language

    [WSDL]. It is imperative that Web/Enterprise 2.0 threats which are taking advantage

    of XML streams and processors are described in their totality. In today's environment,

    architectures and protocols are shifting towards XML and new sets of technology

    vectors are emerging such as REST and XML-RPC. With Web 2.0, new threats loom on

    the horizon and consequently new protection methods are required to defend the

    application layer consuming and serving XML streams. Ajax- and RIA-based applications

    (Flash and Silverlight) are redefining the usage of XML streams and bringing about a

    shift in the threat model.

    In addition to a new model, this document attempts to define the concept of XML

    Intrusion Prevention (XIP) as an analog to traditional network-based intrusion

    prevention. A new type of threat called an XML Content Attack is defined, and examples

    are provided for each layer in the threat model. Also, this document attempts to use the

    problem of lost context between XML processing layers to characterize many of the

    security problems that arise during XML processing.

    Finally, this document encompasses the threat model for Web 2.0 with respect to the

    XIP framework. It is intended that this document will help in discovering and mitigating

    the rising number of threats in the Web 2.0 environment.

    Protecting Enterprise, SaaS & Cloud-based Applications

    A Comprehensive Threat Model for REST, SOA and Web 2.0

    WHITE PAPER

    Intel SOA Expressway

    XML Intrusion Prevention

    Previous generation XML firewalls, inflexible hardware

    security appliances, and ESBs

    are not equipped to handle the

    XML based threats outlined in

    this paper. Service Gateways

    are presented as a new class

    of product that can be used to

    secure, mediate and scale

    SOAP or REST based services

    at the network edge in a

    dynamically changing Enterprise

    security perimeter.

    White Paper: XML Intrusion Prevention

    Table of Contents

    1.0 Changing Environment and XML Stream Utilization
    1.1 XML Intrusion Prevention (XIP)
    2.0 Layered XIP Model
    2.1 Horizontal Threats
    2.2 Vertical Threats
    3.0 XIP Model for Application Layer
    3.1 Web 2.0 Threats
    3.2 Document Encoding Threats
    3.3 Structural Threats
    3.4 Grammar Validation Threats
    3.5 Semantic Representation Threats
    3.6 Semantic Implementation Threats
    3.7 Algorithmic Threats
    3.8 XML Security Threats
    3.9 External Entity Threats
    4.0 Web 2.0 Threats and Countermeasures
    4.1 XSS (Cross Site Scripting) with XML streams
    4.2 CSRF (Cross Site Request Forgery) with XML streams
    4.3 XML poisoning and bombing
    4.4 In transit routing and revelation
    4.5 XPATH manipulations
    4.6 XML node corruption and tampering
    4.7 XML fault enumeration and leakage
    4.8 Tampering with REST
    4.9 XML Bruteforcing
    4.10 XML Data Access Layer Injections
    Proposed Countermeasures
    References

    1.0 Changing Environment and XML Stream Utilization

    XML as a data format or stream is very popular for business-to-business API calls and for businesses that make direct calls to the services layer in an

    application architecture. Over the last decade, we have seen it dominate the enterprise environment. XML is the backbone of SOA, and various

    protocols have been built around XML streams. As shown in Figure 1, a Business Application makes a call to XML services through a

    firewall and consumes the service.

    Until the last couple of years, XML never achieved popularity in the end-client environment. But with the introduction of Web 2.0 frameworks,

    we have seen a drastic shift in communication methodologies. The XMLHttpRequest (XHR) object has empowered browsers to make direct XML

    calls to backend Web services, opening up new doors for XML streams. In the current environment, we are seeing both upstream and downstream

    communication in XML format between browsers and application layers. An application's presentation layer (UI) gets loaded in a browser and

    utilizes components such as Ajax or RIA to communicate directly with the services or business layer. Figure 1 shows a browser making

    a call to an application server's XML services through a firewall.

    In this changing environment, threats to XML streams and service layers exist along the following dimensions:

    1. XML threats specific to business-to-business (services) calls and APIs

    2. XML upstream (browser to services) attacks specific to Web 2.0 components and protocols

    3. XML downstream (services to browser) threats specific to browsers and end-client security

    Thus there is significant augmentation in the threat model in the current environment. We will be covering these dimensions throughout

    this paper.

    Figure 1. Application environment and XML streams

    [Figure: a Web 2.0 client browser stack (Ajax, RIA (Flash), HTML/JS/DOM) or a business application sends upstream XML messages across the Internet and through a firewall (HTTP and XML filters) to SOA or XML-based services (XML-RPC, REST or SOAP) on an application server backed by a database; downstream XML messages return to the business application or client (browser).]

    1.1 XML Intrusion Prevention (XIP)

    This document describes a comprehensive threat model for a new

    breed of threats based on XML content and effectively describes the

    requirements and scope for XML Intrusion Prevention (XIP). The threat

    model suggested here considers the sum total of XML processing as

    a layered architecture that fits between application layer transport

    mechanisms such as HTTP or SMTP and the application. Also, the

    client layer and the XML stream with application context are newly

    added layers which one needs to consider for the Web 2.0 space.

    This new XIP model is required due to the weakest link property of

    secure systems. In any secure system, an attacker will always attempt

    to target the weakest link. In the case of network security, the lower

    level is a more understood space, with a variety of mature products

    and technologies available for network intrusion prevention.

    The XML processing stack, however, is a new area with clear

    weaknesses. These weaknesses are mostly due to the extensibility

    and verbosity of XML; the requirements for application-to-application

    communication using the SOA paradigm [WS-Arch]; and end client-to-

    application communication in the Web 2.0 paradigm. Inter-application

    communication using the SOA paradigm means that XML messages

    are more closely related to executable code, which opens them up to a

    broad array of semantic threats. In the case of the Web 2.0 paradigm,

    XML messages can be both inbound and outbound with specific

    content and context. In some cases, the XML message may contain

    executable code, an attached file, session information, context etc.

    In a Web 2.0 framework, an XML message can be mutated and is more

    context sensitive, thus opening up a new set of threats for both

    services and end clients (browsers).

    Once XML processing is modeled as a network stack, it begins to

    inherit many of the security issues of a layered architecture, including

    the problem of lost context between processing layers. The problem

    of lost context is a repeated theme within the context of XIP and

    occurs when a subordinate layer fails to pass meta-information to an

    upper layer. In other words, the lower layer doesn't communicate all

    of the meta-data required for proper XML processing at the higher

    layer, resulting in a potential vulnerability. Some examples of this

    condition include the failure to pass encoding information, object

    model information, XML security information, or data type information

    between layers.

    If we closely look at the client layer which is processing XML

    messages and data in a Web 2.0 framework, we can see that a set of

    vulnerabilities has opened. Browsers are now capable of processing

    both XML and scripted languages such as JavaScript. This cocktail

    makes the browser vulnerable to threats such as Cross Site Scripting

    (DOM-driven) or Cross Site Request Forgery (CSRF) with cross-domain

    calls. We can see exploitation of Ajax and RIA components with

    this set of vulnerabilities. One added vector that can easily breach

    security is an untrusted source of XML messages and data, for

    example, a browser loading an RSS feed (XML block) from a completely

    untrusted source. Consequently, the client-side layer needs more

    sanitization of the XML downstream.

    The acceleration advantage of these instructions comes from doing

    multiple operations in parallel: equal any compares each input string

    character against a set of characters, ranges compares each input

    string character against a set of character ranges, and equal ordered

    compares a set of substrings against the (shortened) reference string.

    1.1.1 XML Content Attacks

    XML Intrusion Prevention (XIP) is the theory and practice of protecting

    against and mitigating XML Content Attacks. An XML Content Attack is any

    content within an XML document sent to an upstream endpoint

    that places the endpoint in a state that is beneficial to the attacker.

    It should be noted that XML Content attacks pervade all languages

    based around the XML meta-language, including SOAP and XML Web

    Services technologies [WS-Arch]. In the case of Web 2.0 applications

    running with their own services layer, both downstream and upstream XML

    messages need protection and mitigation. Merely providing upstream

    filtering is not enough. One needs to provide total security for

    downstream XML messages as well, which get consumed by Ajax or

    RIA. The responsibility for risk mitigation has shifted to the browser

    as well.

    Due to the highly-structured nature of XML and its constituent

    processors, XML content attacks may be inadvertent and result from

    software bugs or incorrect data models rather than malicious behavior.

    Also, not all XML content attacks are weaknesses in XML itself. In

    many cases, the weaknesses are a combination of implementation

    deficiencies and complex relationships between processing layers. In

    some cases, the stream processor is based on a protocol or architecture,

    particularly with respect to REST and other XML-based RPC calls.

    An attacker utilizing an XML content threat may generally rely on at

    least one of the following assumptions. These assumptions serve to

    make the distinction between an XML content attack and another

    type of attack:

    • The upstream system parses the XML document using some sort of
      XML parser. The parser could be custom-built or off-the-shelf.

    • The upstream system is strongly correlating function calls or method
      invocation to some part of the XML content. In other words, the system
      is, in a loose sense of the word, effectively executing a portion
      of the XML document.

    • In Web 2.0 environments, the downstream comes into play: XML content
      coming from XML-based services can carry malicious content that
      places an end client (browser) at risk and in a potentially
      compromising position.

    • Application-layer XML processing and analysis defects are common
      with Web 2.0. In an enterprise application, Ajax/RIA forms the
      presentation layer, which runs in the browser and performs XML
      processing, while on the server end a proprietary XML processing
      layer is written. This layer can contain support for XML protocols
      such as XML-RPC, REST or SOAP.

    1.1.2 The XIP Device

    This document makes reference to an XIP device, which is defined as

    a firewall that is designed to cover the threats within the XIP space.

    This section attempts to detail some of the key requirements of an

    XIP device.

    Any XIP device (XML Firewall) must meet at least two general

    requirements in order to successfully protect against XML Content

    Attacks: resiliency and content scrutiny.

    The resiliency requirement means that the XIP device must be

    resilient when processing pathological XML data. In other words, the

    device should not crash, fail or be driven into an inconsistent state.

    The content scrutiny requirement means that the XIP device must

    not pass along any content that may crash or overwhelm an upstream or

    downstream entity or cause any type of unauthorized function call

    in an upstream or downstream entity. In other words, the XIP device must

    protect itself and all upstream and downstream entities that may process the

    XML content that flows through it.

    2.0 Layered XIP Model

    This section introduces a layered model for describing XML content

    attacks. The Layered XIP model fits into a traditional network model

    in between application level transport mechanisms such as HTTP and

    the consuming application.

    The purpose of the layered XIP model is to create a conceptual

    structure for understanding the XML content attack threat space. The

    XIP model describes classes of threats specific towards XML content

    and includes threats based around Service Oriented Architectures

    (SOA) and XML Web Services such as XML over HTTP.

    Figure 2. Layered XIP Model

    [Figure: a layered stack. From top to bottom: Application; (E) semantic implementation threats; (D) semantic representation threats; (C) grammar validation threats; (B) structural threats; (A) document encoding threats; HTTP; SSL/TLS; TCP. Three vertical threats span the horizontal layers: (i) algorithmic threats, (ii) external entity threats, (iii) XML security threats.]

    The layered XIP model has two unique properties that differentiate

    it from a traditional network model: it is naturally recursive and the

    threat space is multi-dimensional.

    The model is recursive due to the fact that any piece of XML content

    can be constructed and processed in layers. This means that each

    entity that processes a piece of XML content can only process

    certain nodes or unravel certain parts of the document, leaving

    further processing to another node. This layered processing implies

    that the threat model must be fully re-applied to the well-formed

    subdocument in cases where the XML is not processed in full or by

    its ultimate receiver.

    The multi-dimensional aspect of the model refers to threats that

    are common to multiple horizontal layers, such as algorithmic

    threats or XML security threats. This is represented in the XIP

    model by showing both vertical and horizontal threats. In the multi-

    dimensional model there are specific threats at each level as well as

    threats that span one or more levels. Horizontal threats are denoted

    with uppercase letters and vertical threats are denoted with

    lowercase Roman numerals.

    In addition to multi-dimensional threats, there are also boundary-

    layer threats that may occur between horizontal boundaries. These

    boundary-layer threats usually refer to a problem of lost context

    between layers (such as encoding scheme) which may be the source

    of an XML content attack.

    The practice of defending against XML content attacks begins with

    understanding the threat space. Each layer of the XIP model refers to

    a different threat space specified as follows:

    2.1 Horizontal Threats

    2.1.1 Encoding Threats: An encoding threat is any threat that takes

    advantage of lost encoding information between XML processing

    layers or imprecise encoding implementations. XML documents

    support many different encoding types including UTF-8, UTF-16,

    ISO-8859-1, and Windows-1252. These type differences can

    introduce subtle holes for an attacker.

    2.1.2 Structural Threats: Structural threats refer to oversize XML

    components such as elements, attributes, comments or nesting

    depth. Not all XML parsers behave consistently when handling well-

    formed documents. An attacker may take advantage of an untested

    code-path within an XML processing engine by exploiting structural

    threats.
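
    A minimal sketch of how a device might enforce structural limits while streaming a document. The limit values and the names `check_structure` and `StructuralLimitExceeded` are hypothetical, chosen only for illustration; the parser is Python's standard `xml.parsers.expat` module, and this is not a complete filter:

```python
import xml.parsers.expat


class StructuralLimitExceeded(Exception):
    """Raised when a document exceeds a configured structural limit."""


def check_structure(data: bytes, max_depth: int = 32, max_attrs: int = 16,
                    max_name_len: int = 64) -> None:
    """Stream-parse `data` and raise as soon as any limit is crossed."""
    depth = 0

    def on_start(name, attrs):
        nonlocal depth
        depth += 1
        if depth > max_depth:
            raise StructuralLimitExceeded(f"nesting depth exceeds {max_depth}")
        if len(name) > max_name_len:
            raise StructuralLimitExceeded(f"element name longer than {max_name_len}")
        if len(attrs) > max_attrs:
            raise StructuralLimitExceeded(f"more than {max_attrs} attributes")

    def on_end(name):
        nonlocal depth
        depth -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    # Exceptions raised inside the handlers propagate out of Parse().
    parser.Parse(data, True)


# A small document passes; a deeply nested one is rejected.
check_structure(b"<order><item id='1'/></order>")
deeply_nested = b"<a>" * 40 + b"</a>" * 40
try:
    check_structure(deeply_nested)
except StructuralLimitExceeded as exc:
    print("rejected:", exc)
```

    Because the handlers raise during parsing, an oversized document is rejected as soon as a limit is crossed; limits on text and comment sizes could be added in the same way.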

    2.1.3 Grammar Validation Threats: Grammar validation is the

    process of comparing an XML instance to its defining language.

    The method of grammar validation is variable and may include

    DTD checking [XML], W3C Schema Validation [XML-Schema], Relax

    NG [RELAX-NG], and Schematron-based mechanisms. From an

    XIP standpoint, there are three general problems with grammar

    validation: First, grammar validation mechanisms are not complete

    in and of themselves in fully-specifying content models; second,

    certain grammar validation mechanisms (such as W3C schema) blur

    standard object models such as XPath [XPath]; and third, many XML

    language definitions are built to be extensible, creating weak links

    for an attacker.

    2.1.4 Semantic Representation Threats: Representation threats

    refer to cases where XML represents remote procedure or document

    passing calls. In these cases the underlying language such as SOAP

    [SOAP11] or XML/RPC [XML/RPC] may be subject to semantic

    attacks where function parameters are altered to induce malicious

    behavior.
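
    As a sketch of content scrutiny at this layer, the following checks an XML-RPC call against an allowlist of method names and parameter types before it would be dispatched. The allowlist contents (`getQuote`, `transfer`) and the function name `validate_call` are hypothetical. It uses the standard `xmlrpc.client.loads` function, which itself is not hardened against maliciously constructed XML, so a real deployment would need the lower-layer checks as well:

```python
import xmlrpc.client

# Hypothetical allowlist: method name -> expected parameter types.
ALLOWED_CALLS = {
    "getQuote": (str,),
    "transfer": (str, str, int),
}


def validate_call(request_xml: str):
    """Return (method, params) if the call matches the allowlist, else raise."""
    params, method = xmlrpc.client.loads(request_xml)
    expected = ALLOWED_CALLS.get(method)
    if expected is None:
        raise ValueError(f"method not allowed: {method!r}")
    if len(params) != len(expected):
        raise ValueError("unexpected number of parameters")
    for value, expected_type in zip(params, expected):
        # Exact type match, so a boolean is not accepted where an int is expected.
        if type(value) is not expected_type:
            raise ValueError("parameter has unexpected type")
    return method, params


request = xmlrpc.client.dumps(("ACME",), methodname="getQuote")
print(validate_call(request))
```

    Altered function parameters of the wrong arity or type are rejected here; value-level checks (ranges, formats) would still be needed for parameters of the right type.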

    2.1.5 Semantic Implementation Threats: Implementation threats

    refer to semantic changes within an XML document that rely on

    the actual implementation consuming the XML document. In these

    cases, the actual implementation may be a custom program, such as a

    C or Java program, an XSLT engine, or a Perl script.

    2.2 Vertical Threats

    2.2.1 Algorithmic Threats: Algorithmic threats refer to

    implementations of standard processing algorithms, such as hash

    tables, that may be efficient in the average case but may degenerate

    into exponential space or time behavior with carefully chosen input.

    If an attacker can guess the data structure or general algorithm

    used, the attacker may be able to produce a Denial of Service (DoS)

    condition in the XML processing application. While these types of

    threats are not specific to XML processing, they are important for

    XIP due to the intense algorithmic processing that XML undergoes

    due to its highly structured nature.
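
    The effect can be illustrated with a small experiment, assuming a hash table keyed by attacker-influenced names. The class `CollidingKey` is hypothetical and simply forces every key to share one hash value; in this particular illustration the degradation is quadratic in the number of keys rather than exponential:

```python
class CollidingKey:
    """Key whose hash is constant, so every key lands in the same probe chain."""

    comparisons = 0  # counts equality checks performed by the hash table

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 0

    def __eq__(self, other):
        CollidingKey.comparisons += 1
        return isinstance(other, CollidingKey) and self.value == other.value


def count_comparisons(keys) -> int:
    """Insert all keys into a dict and report how many __eq__ calls occurred."""
    CollidingKey.comparisons = 0
    table = {}
    for key in keys:
        table[key] = True
    return CollidingKey.comparisons


n = 200
well_distributed = count_comparisons(range(n))
colliding = count_comparisons(CollidingKey(i) for i in range(n))
print(well_distributed, colliding)
```

    With well-distributed keys no collision handling is needed, while the colliding keys force each insertion to be compared against the keys inserted before it.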

    2.2.2 External Entity Threats: XML documents use URI [RFC2396]

    mechanisms to refer to external or document internal data and

    structures. These URI references are de-referenced not only during

    grammar validation, but also during application processing. To this

    effect, an external entity threat is any type of threat where an

    external URI is overloaded with malicious data that causes an XML

    application to de-reference an external source unnecessarily. This

    covers all types of external entity attacks, not just those based on

    the Document Type Definition.
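
    For the DTD-based case only, a minimal sketch of one possible countermeasure is to refuse any document that carries a document type declaration before entities can be declared or de-referenced. The names `parse_without_dtd` and `DoctypeRejected` are hypothetical; the handler attribute `StartDoctypeDeclHandler` is part of Python's standard `xml.parsers.expat` module. URI references outside the DTD, which the text above also counts as external entity threats, are not covered by this sketch:

```python
import xml.parsers.expat


class DoctypeRejected(Exception):
    """Raised when a document contains a DOCTYPE declaration."""


def parse_without_dtd(data: bytes) -> None:
    """Parse `data`, rejecting it at the first DOCTYPE declaration."""

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeRejected(f"DOCTYPE {name!r} is not permitted")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(data, True)


hostile = (b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
           b"<foo>&xxe;</foo>")
try:
    parse_without_dtd(hostile)
except DoctypeRejected as exc:
    print("rejected:", exc)
```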

    2.2.3 XML Security Threats: XML documents may have W3C

    security mechanisms such as XML Signature [XMLSIG] or XML

    Encryption [XENC] applied, either directly or under the profile of

    another specification such as OASIS WS-Security [WSS-Sec]. Because

    XML Signature and XML Encryption are in fact XML representations,

    they have the same inherent security problems as generic XML

    data, including attacks based on encoding and external entities.

    Further, specialized attacks such as the Davis Attack [DAVIS] are

    also possible due to the recursive nature of XML Encryption and XML

    Signature. Any threat that is capable of subverting or manipulating

    message-level security mechanisms without directly attacking the

    cryptography is considered an XIP-based XML Security threat.

    3.0 XIP Model for Application Layer

    In Figure 2 we encountered threats below the application layer. Next,

    we need to decompose the application layer. Enterprise applications

    running in the Web 2.0 space use XML streams at both ends and

    various components to process these streams. The application layer

    can be divided into two segments, client and server, as shown in

    Figure 3.

    Client layer: This layer runs in the browser, where XML streams are

    processed by several different components such as the DOM, the JavaScript

    engine, plug-ins, and the XHR object. The application layer would have

    its own XML processing, leveraging the XHR object or custom-developed

    modules or plug-ins. The XML stream is captured from

    HTTP and is passed to layers above in a browser model. A malicious

    XML stream directed at the browser can create a significant threat

    and can compromise end-client security.

    Figure 3. Application Layers for XML streams

    [Figure: a client stack (DOM, JavaScript, Plug-in (Flash/Silverlight), XMLHttpRequest (XHR), HTTP/TCP) connected through an in-transit pipe carrying XML streams to a server stack (Web 2.0 App Source Code, XML Processor (Library), HTTP/TCP). On the server side, Data Access reaches an internal information source and a Cross Domain Proxy reaches an external information source.]

    Server layer: The XML processor engine is built from libraries or

    customized code on the server end. Web 2.0 application source code

    leverages these libraries and uses protocols such as XML-RPC,

    SOAP or REST. All these protocols are XML-driven and capable of

    carrying payloads. The application source uses a data access

    layer to consume internal resources. Web 2.0 applications often

    require cross-domain calls, but the browser

    can't bypass the SOP (Same Origin Policy). One way to work around it

    is to implement a proxy on the server end. This proxy fetches a stream from

    the targeted external resource over the Internet and passes the information

    back to the application or client layer.

    Threats need to be analyzed against the above model. They

    fall into three different segments:

    a. Application layer XML processing source

    b. Client or browser layer XML processing calls

    c. In-Transit threat for XML streams

    These threats are listed in the next section.

    3.1 Web 2.0 Threats

    A Web 2.0 threat model for XML streams includes the following set of

    threats and risks.

    3.1.1 XSS (Cross Site Scripting) with XML streams: XSS is a very

    common threat vector for Web 2.0 applications. The XHR object takes an

    XML stream and processes it under the browser's DOM context. An XML

    stream coming from the server via a proxy or database may contain an

    executable script. This script is embedded in the XML payload and

    can cause damage to the end-client session. In certain cases, XML

    upstream can have malicious JavaScript which can cause persistent

    or non-persistent XSS as well.
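
    A minimal sketch of downstream sanitization, assuming a server-side component that turns feed titles from an untrusted XML stream into HTML before it reaches the browser. The function name `render_titles` and the feed layout are hypothetical; `html.escape` and `xml.etree.ElementTree` are standard Python modules, and the parser used here is not itself hardened against the structural and entity threats described earlier:

```python
import html
import xml.etree.ElementTree as ET


def render_titles(feed_xml: str) -> str:
    """Render every <title> in the feed as an HTML list item, escaped."""
    root = ET.fromstring(feed_xml)
    # Escape the text so that markup inside the XML payload is displayed,
    # not interpreted, when the fragment is inserted into a page.
    items = [html.escape(title.text or "") for title in root.iter("title")]
    return "<ul>" + "".join(f"<li>{item}</li>" for item in items) + "</ul>"


feed = ("<rss><channel><item>"
        "<title>&lt;script&gt;alert(1)&lt;/script&gt;</title>"
        "</item></channel></rss>")
print(render_titles(feed))
# prints <ul><li>&lt;script&gt;alert(1)&lt;/script&gt;</li></ul>
```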

    3.1.2 CSRF (Cross Site Request Forgery) with XML streams:

    XML streams originating from the browser can be easy prey for

    CSRF attacks. Services consuming XML streams are often

    associated with critical transactions, and if CSRF is forced on

    the end client, it can lead to a compromise of identity.

    3.1.3 XML poisoning and bombing: XML streams injected into the

    application layer are processed by libraries and source code. If XML

    streams are poisoned with multiple and recursive nodes, they can affect

    the server-side application layer. Some of the threats defined in section

    2 can be part of this layer.

    3.1.4 In transit routing and revelation: XML streams in some

    cases can be routed using WS-Routing protocols. In those cases,

    if intermediate nodes are compromised, then entire streams can

    be accessed and manipulated. XML streams can be accessed and

    hijacked while in transit if protective measures are not taken.

    3.1.5 XPATH manipulations: XPATH is a popular means for querying

    XML documents. Components related to XPATH can reside in the

    application layer. If XPATH manipulations are done in XML streams,

    potential damage may be inflicted on the Web 2.0 resources.
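
    One way such manipulation arises is when a value taken from an XML stream is concatenated into an XPath expression. The sketch below assumes a hypothetical user directory and a hypothetical helper `find_user`, and validates the value against an allowlist pattern before it is placed in the expression. It uses the limited XPath support in Python's standard `xml.etree.ElementTree`:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical policy: user names are short and use a restricted alphabet.
SAFE_VALUE = re.compile(r"[A-Za-z0-9_.-]{1,64}")


def find_user(root: ET.Element, username: str):
    """Look up <user name="..."> elements, refusing suspicious input."""
    if not SAFE_VALUE.fullmatch(username):
        raise ValueError("user name contains characters that are not allowed")
    return root.findall(f"./user[@name='{username}']")


directory = ET.fromstring(
    "<users>"
    "<user name='alice' role='admin'/>"
    "<user name='bob' role='user'/>"
    "</users>"
)
print([u.get("role") for u in find_user(directory, "bob")])  # prints ['user']

try:
    find_user(directory, "' or '1'='1")
except ValueError as exc:
    print("rejected:", exc)
```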

    3.1.6 XML node corruption and tampering: XML nodes carry critical

    data, which can have logical processing nodes as well. It is possible to

    tamper with these nodes which can then breach the business-logic layer.

    Web 2.0 resources can be exploited by tampering with these nodes.

    3.1.7 XML fault enumeration and leakage: Web 2.0 resources

    process XML structures, and if any errors occur during

    processing, they emit a fault that is embedded in the XML stream.

    These faults can help an attacker enumerate vulnerabilities.

    This information leakage is another source for vulnerability

    substantiation.

    3.1.8 Tampering with REST: REST-based applications run

    over the XML pipe only. REST inputs are in XML for certain

    operations such as CREATE, and these operations can be tampered with.

    3.1.9 XML Bruteforcing: If an authentication process is carried out

    over XML nodes, then at the application layer one can bruteforce the

    authentication streams. This is a common way of gaining access to an

    application.

    3.1.10 XML Data Access Layer Injections: In certain types of Web

    2.0 applications, Ajax calls may initiate direct XML streams and calls

    to backend data services. These streams become entry points for

    potential SQL interfaces. If these streams are not well guarded, they can

    lead to SQL exploitation.
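
    A minimal sketch of guarding such an entry point, assuming a hypothetical `accounts` table and a hypothetical request layout. The value extracted from the XML stream is bound as a query parameter instead of being concatenated into the SQL text; `sqlite3` is used only because it ships with Python:

```python
import sqlite3
import xml.etree.ElementTree as ET


def lookup_balance(conn: sqlite3.Connection, request_xml: str):
    """Return the balance for the <account> named in the request, or None."""
    account = ET.fromstring(request_xml).findtext("account", default="")
    # The "?" placeholder keeps the XML-supplied value out of the SQL grammar.
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account,)
    ).fetchone()
    return None if row is None else row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A1', 100)")

print(lookup_balance(conn, "<request><account>A1</account></request>"))
# prints 100
print(lookup_balance(conn, "<request><account>A1' OR '1'='1</account></request>"))
# prints None
```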

    3.2 Document Encoding Threats

    3.2.1 Encoding Discrepancies (refer to section 2.1)

    HTTP headers utilize a charset parameter intended to specify the

    encoding for the content-type. In cases where the XML payload has

    a different encoding, the encoding specified in the HTTP parameter

    takes precedence.

    In cases where the XML payload is stored with the XML declaration,

    the original content-type in the HTTP parameters may be lost if the

    content-type isn't explicitly stored. The discrepancy in the encoding

    can lead to buffer overflow and misinterpretation conditions.
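
    A minimal sketch of detecting this kind of discrepancy before the transport context is discarded. The function names are hypothetical, the regular expressions are a simplification that only handles ASCII-compatible encodings, and `codecs.lookup` is used to normalize encoding aliases:

```python
import codecs
import re

CHARSET_RE = re.compile(r'charset\s*=\s*"?([A-Za-z0-9._-]+)"?', re.IGNORECASE)
# Simplification: assumes the XML declaration is readable as ASCII bytes.
XML_DECL_RE = re.compile(rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')


def declared_encodings(content_type: str, body: bytes):
    """Return (transport charset, XML declaration encoding); None if absent."""
    header = CHARSET_RE.search(content_type)
    decl = XML_DECL_RE.match(body)
    return (header.group(1) if header else None,
            decl.group(1).decode("ascii") if decl else None)


def encodings_conflict(content_type: str, body: bytes) -> bool:
    """True when both layers declare an encoding and the two disagree."""
    transport, document = declared_encodings(content_type, body)
    if transport is None or document is None:
        return False
    try:
        return codecs.lookup(transport).name != codecs.lookup(document).name
    except LookupError:
        return True  # an unknown encoding name is treated as suspicious


body = b'<?xml version="1.0" encoding="UTF-8"?><doc/>'  # hypothetical payload
print(encodings_conflict("text/xml; charset=iso-8859-1", body))  # prints True
```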

    For example, consider the following HTTP POST and XML payload:

    POST /test/service1 HTTP/1.1

    From: [email protected]

    User-Agent: AttackClient/1.0

    Content-Type: text/xml; charset=iso-8859-1

    Content-Length: XX

    According to the HTTP parameters, the encoding of the XML

    document should be ISO-8859-1, not UTF-8.

    If we assume that the document contains a division sign

    (ISO-8859-1 code 247), the bits on the wire should be represented as

    follows:

    1111 0111 (0xF7)

    For ISO-8859-1 encoding this is perfectly valid for a division sign.

    If this encoding context is lost however, the character would be

    interpreted as UTF-8 per the XML declaration. In this case, the UTF-8

    rules would interpret the first five bits (1111 0) as a length indicator

    which would use a 4-byte sequence as the character representation

    instead of the one byte for the division sign.

    This implies that the bytes that follow the division sign in the original

    document would be interpreted as a single 4-byte UTF-8 character,

    effectively changing the syntax of the actual XML document. This

    occurs because the UTF-8 encoding is counting the three bytes that

    occur *after* the division sign in the ISO-8859-1 encoded document.

    This type of encoding discrepancy can occur on any character that

    looks like a multi-byte UTF-8 encoding based on the first four bits. For

    ISO-8859-1 this would be characters in the range 192 (0xC0) through

    255 (0xFF).

    An attacker may use knowledge regarding subtleties in how

    encodings are handled to alter XML documents in a malicious way,

    effectively changing the way the content is interpreted.
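    The byte-level effect described in this section can be sketched in a few lines of Python. This is an illustrative sketch only: the element name doc and the helper utf8_sequence_length are assumptions introduced for the example and are not taken from the original payload.

```python
def utf8_sequence_length(lead_byte: int):
    """Return the sequence length a UTF-8 lead byte announces, or None."""
    if lead_byte < 0x80:
        return 1                      # 0xxxxxxx: single byte (ASCII)
    if lead_byte & 0xE0 == 0xC0:
        return 2                      # 110xxxxx
    if lead_byte & 0xF0 == 0xE0:
        return 3                      # 1110xxxx
    if lead_byte & 0xF8 == 0xF0:
        return 4                      # 11110xxx
    return None                       # continuation byte or not a lead byte


# Hypothetical payload: a division sign inside a made-up <doc> element.
wire_bytes = "<doc>\u00f7</doc>".encode("iso-8859-1")

# With the HTTP charset honoured, the byte 0xF7 is a single character.
as_latin1 = wire_bytes.decode("iso-8859-1")

# With the context lost, the same byte announces a 4-byte UTF-8 sequence,
# so the three bytes that follow it ("</d") would be consumed with it.
announced = utf8_sequence_length(wire_bytes[5])

# A strict decoder rejects the input rather than misreading the markup.
try:
    wire_bytes.decode("utf-8")
    strict_utf8_accepts = True
except UnicodeDecodeError:
    strict_utf8_accepts = False
```

    The design point is that a processor should either carry the transport charset along with the stored payload or decode strictly, so that a mismatch surfaces as an error instead of as altered markup.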

    3.2.2 Minimal Encoding Rule

    Encoding schemes such as UTF-8 allow for multi-byte encoding forms

    to support larger ranges of characters. In cases where XML parsers

    don't follow the minimal encoding rule, multiple bytes may be used to

    encode characters that only require a single byte value. For example,

    consider the ASCII character A, which has a value of 65 (0x41). If the

    minimal encoding rule is not used, UTF-8 may represent the character

    A as a two-byte sequence 0xC1 0x81 (11000001 10000001).

    XML processors that do not recognize non-minimal encoding properly

    may end up forwarding malicious content to other systems. This may

    include processing instructions or complete elements. This occurs in

    situations when the minimal encoding rule is not recognized during

    input, but is normalized during output.

    For example, the non-minimal 2-byte UTF-8 encoding for the left

    angle bracket is the value 0xC0 0xBC (11000000 10111100):

    [0xC0 0xBC]malicious>

    [0xC0 0xBC]/malicious>[0xC0 0xBC]?dangerous pidata?>

    In the previous example, the malicious element is formed by using a

    non-minimal encoding to represent the left angle bracket. This would

    be represented in the content as the single-byte characters that correspond

    to 0xC0 and 0xBC. In this case, the hack applied to the

    left angle bracket will confuse a UTF-8 implementation that does not

    properly recognize non-minimal encoding and it will believe that the

    enclosing element contains just content and no elements or processing

    instructions. Note that only the left angle bracket must be hacked to trick the XML parser into believing the enclosing element contains text

    content.

    When the document is serialized and forwarded on, the

    normalization mechanism may recognize the two-byte sequences as

    an angle bracket and may insert markup that could be processed by

    downstream entities. In this scenario, it is important for an XIP device

    to ensure that if content is blocked at a perimeter, it enforces these

    rules for all back-end downstream entities.

    For example, it may be discovered that a particular XML processing node has a security hole with certain processing instructions. This

    minimal encoding rule attack, if present, may create a way for an

    attacker to bypass an XIP device that is attempting to filter processing

    instructions or any other XML content.

    Example (Normalized):

    <malicious>

    </malicious><?dangerous pidata?>
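    The gap between a byte-level filter and a normalizing serializer can be sketched as follows. The function lenient_utf8_decode is a deliberately flawed, hypothetical decoder written for this illustration; it stands in for any implementation that accepts non-minimal forms and is not taken from a real parser.

```python
def lenient_utf8_decode(data: bytes) -> str:
    """Decode 1- and 2-byte UTF-8 WITHOUT rejecting non-minimal forms.

    Deliberately flawed: it models the normalizing step described above.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif (b & 0xE0 == 0xC0 and i + 1 < len(data)
              and data[i + 1] & 0xC0 == 0x80):
            # No check that the code point really needs two bytes.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError(f"unsupported byte 0x{b:02X} at offset {i}")
    return "".join(out)


def strict_utf8_ok(data: bytes) -> bool:
    """True only if a standards-conforming decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


attack = b"\xc0\xbc?dangerous pidata?>"

filter_sees_pi = b"<?" in attack           # what a byte-level filter finds
normalized = lenient_utf8_decode(attack)   # what a lenient serializer emits
```

    In this sketch the perimeter filter finds no processing instruction in the raw bytes, while the lenient decoder produces one. Rejecting input that fails strict decoding closes the gap before any filtering takes place.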

    3.2.3 Multi-byte End of Buffer Sequences

    The attacks described in this section rely on the multi-byte encoding

    used by UTF-8 and UTF-16. Both of these attacks are speculative and

    require a sophisticated attacker with intimate knowledge of the XML

    parsing implementation.

    For UTF-16, most characters can be encoded using a single 16-bit

    word. For code points of 2^16 and above, a surrogate pair of two 16-bit

    words must be used. A sophisticated attacker could place the first

    word of a surrogate pair in an XML document where the attacker

    knows the XML processor will read at a buffer boundary. This may

    cause the XML processor to read past the end of the buffer, causing a

    possible buffer overflow attack.

    A similar situation occurs for UTF-8, which represents

    the length of the encoded octets in the first byte of the multi-byte

    sequence. For example, a leading five-bit pattern of 11110 (11110XXX) indicates a

    four-byte UTF-8 sequence. If this appears as the last byte of input, it

    is possible that an XML processor would attempt to read past the end

    of a buffer, causing a misinterpretation of the XML content or a possible

    buffer overflow.
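    A defensive check for this condition can be sketched with Python's standard incremental decoders, which hold back an incomplete trailing sequence instead of reading beyond the input. The helper name trailing_incomplete_bytes is an assumption made for this example.

```python
import codecs


def trailing_incomplete_bytes(buf: bytes, encoding: str) -> bytes:
    """Return the bytes of an unfinished multi-byte character at the end of buf.

    An empty result means the buffer ends on a character boundary.
    Malformed input in the middle of the buffer raises UnicodeDecodeError.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    decoder.decode(buf, final=False)      # keeps an incomplete tail buffered
    pending, _ = decoder.getstate()
    return pending


# A 4-byte UTF-8 lead byte placed as the very last byte of input.
utf8_tail = trailing_incomplete_bytes(b"abc\xf0", "utf-8")

# The first word of a UTF-16 surrogate pair (0xD800) with nothing after it.
utf16_tail = trailing_incomplete_bytes(b"a\x00\x00\xd8", "utf-16-le")
```

    A processor that finds a non-empty tail at the true end of input can reject the document, rather than fetching bytes that are not there.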

    3.3 Structural Threats

    Some of the keywords for most threats in this category include:

    coercive parsing, jumbo payloads, oversize payloads, XML Denial

    of Service (XDoS), node depth attacks, buffer overflow, and XML

    component attacks.

    An important feature of attacks in this category is that the XML is

    well-formed, meaning that it is considered syntactically valid by most

    parsers without limit enforcement.

    Each of the following examples shows a potential attack in this

    category.

    3.3.1 Oversize Payload (refer to section 2.1)

    An oversize payload is any XML element that contains a large amount

    of data intended to cause a Denial of Service condition or buffer

    overflow condition during parsing. This type of attack is perhaps one

    of the most basic attacks that may be used to overwhelm a parser.

    A simple example of an oversize payload is a large amount of encoded

    data within an element.

    AdsG4d943wvcjur532MZ42Fdsj+2jfrws=2r45j2fwS

    DjgfdsFRjs24942309fsDxzjtr32ur539sdZxjfws0t5

    r432jdlsfff

    Another example of an oversize payload is an inordinately large

    number of elements within an XML document:

    3.3.2 Oversize Element Names, Attribute Names, Processing Instruction Target Names

    The size of an individual XML component may also be the target of an

    attack.

    An example of oversized element names, attribute names and

    processing instruction targets is shown as follows:

    3.3.3 Oversized Attribute Count

    An attacker may also attempt to break a parser by supplying a large

    number of attributes on a single element. For example:

    3.3.4 Deep Element Nesting

    An attacker may attempt to break a parser by supplying a deeply

    nested XML document.

    For example:

    <!-- millions of nested elements may cause

    trouble with some XML parsers -->

    3.3.5 Oversized Comments, Character Data, Processing

    Instruction Data, and Attribute Values

    An attacker may attempt to break a parser by generating various

    oversized components. For example:

    <!-- An oversized comment such as this one may

    continue to hundreds or thousands of megabytes

    causing pathological conditions in an XML

    parser!-->432AdsG4d943wvcjur532MZ42Fdsj+2jfrws=2r45j2fwSD

    jgfdsFRjs24942309fsDxzjtr32ur539sdZxjfws0t5r432

    jdlsfff432AdsG4d943wvcjur532MZ42Fdsj+2jfrws=2r4

    5j2fwSDjgfdsFRjs24942309fsDxzjtr32ur539sdZxjfws

    0t5r432jdlsfff432AdsG4d943wvcjur532MZ42Fdsj+2jfr

    ws=2r45j2fwSDjgfdsFRjs24942309fsDxzjtr32ur539

    sdZxjfws0t5r432jdlsfff432AdsG4d943wvcjur532MZ

    42Fdsj+2jfrws=2r45j2fwSDjgfdsFRjs24942309fsDxzjt

    r32ur539sdZxjfws0t5r432jdlsfff

    -->

    sG4d943wvcjur532MZ42Fdsj+2jfrws=2r45j2fwSDjgfdsF

    Rjs24942309fsDxzjtr32ur539sdZxjfws0t5r432jdlsfsG4d943wvcjur532MZ42Fdsj+2jfrws=2r45j2fwSDjgfdsFRj

    s24942309fsDxzjtr32ur539sdZxjfws0t5r432jdlsf

    ]]>
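    The limit enforcement that the structural threats in this section call for can be sketched with a streaming parser. The following is a minimal illustration built on Python's standard expat binding; the function name check_limits, the exception LimitExceeded and every default limit value are assumptions chosen for the example, not recommendations from this document.

```python
import xml.parsers.expat


class LimitExceeded(Exception):
    """Raised as soon as a document crosses one of the configured limits."""


def check_limits(xml_bytes, max_bytes=1_000_000, max_elements=10_000,
                 max_depth=64, max_attrs=32, max_name=128, max_value=10_000):
    if len(xml_bytes) > max_bytes:
        raise LimitExceeded("document too large")
    depth = 0
    elements = 0
    text_len = 0

    def start(name, attrs):
        nonlocal depth, elements, text_len
        depth += 1
        elements += 1
        text_len = 0
        if depth > max_depth:
            raise LimitExceeded("nesting too deep")
        if elements > max_elements:
            raise LimitExceeded("too many elements")
        if len(attrs) > max_attrs:
            raise LimitExceeded("too many attributes")
        if len(name) > max_name or any(len(k) > max_name for k in attrs):
            raise LimitExceeded("name too long")
        if any(len(v) > max_value for v in attrs.values()):
            raise LimitExceeded("attribute value too long")

    def end(name):
        nonlocal depth, text_len
        depth -= 1
        text_len = 0

    def chars(data):
        nonlocal text_len
        text_len += len(data)       # expat may deliver text in several chunks
        if text_len > max_value:
            raise LimitExceeded("character data too long")

    def pi(target, data):
        if len(target) > max_name or len(data) > max_value:
            raise LimitExceeded("processing instruction too long")

    def comment(data):
        if len(data) > max_value:
            raise LimitExceeded("comment too long")

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.ProcessingInstructionHandler = pi
    parser.CommentHandler = comment
    parser.Parse(xml_bytes, True)   # handler exceptions propagate to the caller
```

    Because comments, names and attribute values are only handed to the callbacks after the parser has read them, the overall byte limit applied before parsing remains the first line of defense in this sketch.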

    3.4 Grammar Validation Threats

    Some of the keywords for most threats in this category include:

    schema poisoning, extraneous declarations.

    Many of the attacks in this category highlight misconceptions regarding

    grammar validation as a security feature. In many cases, grammar

    validation fails to provide adequate protection against malicious content. In other cases, lazy implementations may skip subtle requirements that

    can lead to security holes.

    Each of the following examples shows a potential attack in this

    category.

    3.4.1 Schema Extensibility (refer to section 2.1)

    Many standard security schemas are extensible, meaning that it is

    permissible to insert arbitrary content into the data model without af-

    fecting the schema validity of the document instances. This is done to

    provide extensibility for the language defined by the schema.

    Some important examples of this include OASIS WS-Security [WSS-

    Sec], W3C XML Signature [XMLSIG], and W3C XML Encryption [XENC].

    These examples are highlighted because they are security centric

    schemas.

    Extensible schemas are schemas that make use of the any or

    anyAttribute element along with the lax attribute value for the pro-

    cessContents attribute.

    Below is an example from the OASIS WS-Security 1.0 schema that

    shows how the main Security header element is designed to allow any

    arbitrary content. The problematic areas are shown in bold:

    The use of any is to allow extensi-

    bility and different forms of security

    data.

    A simple Denial of Service attack using an oversized payload is shown

    below. It is important to note that this element is considered schema

    valid, based on the element definition.

    DoS

    DoS

    DoS

    DoS

    DoS

    DoS

    DoS

    3.4.2 Schema Type Coercion

    Some schema validation implementations may overlook type coercion.

    For example, consider the following element definition for a serial

    number:

    This schema defines an element called serialnumber to be of

    type integer.

    Consider now an instance document that uses the W3C schema rules for inline type coercion. This mechanism allows a type to be modified

    inline through the use of the type attribute.

    For example, because an xs:unsignedByte is a subtype of

    xs:integer, the following document is schema valid:

    <serialnumber xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

    xmlns:xs="http://www.w3.org/2001/XMLSchema"

    xsi:type="xs:unsignedByte">153</serialnumber>

    If the type coercion is pushed upwards towards a super type, a com-

    pliant schema validation engine is supposed to throw an error. For

    example:

    153.343

    Care must be taken when implementing schema validation rules. An

    attacker armed with these subtle rules may be able to bypass type-

    checking in non-compliant validation engines, effectively passing

    through XML content forbidden by the schema.
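    One conservative mitigation, offered here as an assumption rather than as a rule from the schema specification, is to detect inline type overrides before validation and reject or audit any document that carries them. A minimal sketch using Python's standard ElementTree follows; the helper name find_type_overrides is hypothetical, and the sketch does not harden the parser against the other threats in this document.

```python
import xml.etree.ElementTree as ET

# Expanded name of the xsi:type attribute.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def find_type_overrides(xml_text: str):
    """Return (element tag, declared type) for every inline xsi:type."""
    root = ET.fromstring(xml_text)
    return [(el.tag, el.attrib[XSI_TYPE])
            for el in root.iter()
            if XSI_TYPE in el.attrib]


coerced = (
    '<serialnumber xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'xsi:type="xs:unsignedByte">153</serialnumber>'
)
overrides = find_type_overrides(coerced)
```

    A policy layer could then allow only overrides that appear on an explicit list, leaving the subtype check itself to a compliant validation engine.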

    3.4.3 Lazy Type Definitions

    Schema extension points may also be introduced through carelessness

    when constructing schema models. For example, an element

    definition that neglects a type attribute will be implicitly defined as

    xs:anyType, allowing it to contain unbounded content. For example:

    The previous definition would make it possible to insert any arbitrary content within the payload element.

    3.4.4 Dirty Word Filtering

    W3C XML Schema has a feature to enforce regular expressions within

    content through the use of the pattern facet. While this feature is

    useful for constraining the value space of XML content, it is not powerful

    enough to provide a document-level word filtering mechanism.

    For example, many XML processing systems may wish to filter content

    based on dirty words that may signal semantic attacks. For example,

    filtering on common SQL commands within XML content may mitigate

    SQL injection attacks.

    For example, consider the following SOAP realization of a function call

    that looks up sensitive information based on string-based region or

    city name.

    US WestCoast

    Assuming that the string US West Coast is passed to some sort of SQL

    database engine, it would be prudent to filter dirty words

    such as WHERE, SELECT, FROM, EXEC, 1=1 from the content, but this

    cannot be done with a regular expression using the pattern facet.

    For example, consider a pattern facet that constrains an integer

    value to a 3-digit number from the value space of 0 through 9, a

    5, and then 0 through 3.

    This regular expression works well because the content being

    bounded is an integer. To write the same expression for a string would

    involve an explosion of regular expressions that would result in trying

    to map the entire allowable value space for all strings pertinent to this

    function.

    In other words, each possible string must be represented as a regular

    expression, with the union of all such possibilities being defined

    as the regular expression to appear in the pattern facet. The limitation

    of the pattern facet for XML schema occurs because there is

    no suitable way to express the negation of a regular expression or set

    of words, meaning W3C Schema Validation cannot adequately provide

    word filtering without additional custom checking.
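    The additional custom checking mentioned above could take the form of a deny-list scan applied to text content outside the schema validator. The sketch below is an assumption about what such a check might look like; the word list comes from the example in this section, while the function name and the tautology rule are illustrative.

```python
import re

# Words taken from the example above; a real list would be site-specific.
DIRTY_WORDS = ("WHERE", "SELECT", "FROM", "EXEC")

_WORD_RE = re.compile(r"\b(" + "|".join(DIRTY_WORDS) + r")\b", re.IGNORECASE)
# Matches numeric tautologies such as 1=1 or 42 = 42.
_TAUTOLOGY_RE = re.compile(r"\b(\d+)\s*=\s*\1\b")


def contains_dirty_word(text: str) -> bool:
    """True if the text holds a listed SQL keyword or a numeric tautology."""
    return bool(_WORD_RE.search(text) or _TAUTOLOGY_RE.search(text))
```

    A filter of this kind will also flag harmless text that happens to contain a listed word, so it works best on fields whose legitimate values are known to be narrow, such as the region name in the example.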

    3.4.5 Schema Model Mismatching

    Typical instantiations of XML processors make heavy use of XSLT

    [XSLT] for data transformation as well as for general XML process-

    ing. XSLT is itself based around the XPath Data Model [XPath] which is

    an object model that views an XML document as a tree consisting of

    root, element, text, attribute, namespace, processing instruction and

    comment nodes.

    W3C Schema Validation doesn't use the same XPath data model during

    validation time and is required to normalize processing instruction

    and comment nodes. This normalization can provide a subtle hole for a

    sophisticated attacker to bypass even custom length checks.

    For example, an XML processor that is enforcing limits on the size of

    text nodes may mandate a size restriction for element content using

    a custom enforcement mechanism. This type of limit enforcement

    should be used and is recommended to mitigate structural threats

    with oversized payloads.

    For example, consider the following SOAP request for employee information based on a unique identifier:

    0000012345A

    BC

    Now consider an object model-based limit enforcement mechanism

    employed prior to W3C schema validation that limits text nodes to

    10K. This would effectively cap memory allocation during parsing

    at 10K per text node.

    Now consider a malicious document that exploits the fact that the

    XML processor is also performing schema validation:

    000012345ABC

    0000012345ABC

    0000012345ABC

    0000012345ABC

    0000012345ABC

    In the previous document an attacker has created a repeating se-

    quence of processing instructions and text nodes. Each text node

    alone is well under the 10K limit. The attacker can continue this way,

    inserting thousands of such pairs, without tripping the XPath based

    limit enforcement.

    When schema validation occurs, the contents of the CustID element

    will be normalized, effectively removing all processing instructions.

    Once these processing instructions are stripped, memory will be allocated

    to hold the sum of all of the text nodes to continue schema

    validation. This is due to the fact that processing instructions and

    comments are invisible to schema validation, but individually each text

    node passes the limit enforcement test.

    The net effect is that an attacker can cause an XML processor to

    overcome its own limit enforcement if implementations aren't mindful

    of how different technologies model XML data. This is another exam-

    ple of a boundary-layer problem of lost context between the XPath

    object model and the W3C schema validation model.
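    The mismatch can be made concrete by measuring both views of the same document: the largest individual text node, which is what the object-model limit sees, and the largest cumulative text per element, which is what must be held once processing instructions are dropped. The sketch below uses Python's standard expat binding; the function name text_sizes, the processing instruction target pad and the sample values are assumptions for illustration.

```python
import xml.parsers.expat


def text_sizes(xml_text: str):
    """Return (largest single text node, largest total text in one element)."""
    largest_node = 0
    largest_total = 0
    node_len = 0
    totals = []                     # stack of cumulative text per open element

    def close_node():
        nonlocal largest_node, node_len
        largest_node = max(largest_node, node_len)
        node_len = 0

    def start(name, attrs):
        close_node()
        totals.append(0)

    def end(name):
        nonlocal largest_total
        close_node()
        largest_total = max(largest_total, totals.pop())

    def chars(data):
        nonlocal node_len
        node_len += len(data)
        if totals:
            totals[-1] += len(data)

    def pi(target, data):
        close_node()                # a processing instruction splits text nodes

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.ProcessingInstructionHandler = pi
    parser.Parse(xml_text, True)
    return largest_node, largest_total


# 1000 short text nodes, each followed by a processing instruction.
doc = "<CustID>" + "0000012345<?pad x?>" * 1000 + "</CustID>"
per_node, per_element = text_sizes(doc)
```

    Enforcing the limit on the cumulative figure as well as on the individual node keeps the check consistent with what the validation layer will later allocate.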

    3.5 Semantic Representation Threats

    Some of the keywords for most threats in this category include:

    Parameter Tampering, Routing Detours, WSDL Scanning/Enumeration,

    SOAP replay attack, Malicious Morphing, SOAP array attack, SOAP

    fault scanning, SOAP SQL injection.

    When an XML document is sent from one system to another, various

    languages can be used to represent the semantics of the document. A

    lightweight envelope defined by SOAP [SOAP11] is often used within the XML Web Services paradigm to represent remote procedure calls.

    Other mechanisms such as XML/RPC [XML/RPC] can also be used. All

    of the attacks in this section rely on the specific representation uti-

    lized within the XML message.

    3.5.1 SOAP Parameter Tampering (refer to section 2.1)

    SOAP represents function calls in a language and platform-neutral

    representation. A parameter tampering attack is simply an attack

    caused by the manipulation of the function parameters within the

    SOAP message. This parameter manipulation may cause unwanted

    function calls or Denial of Service conditions in the backend system.

    For example:

    Data1

    Data2

    Data3

    In the previous example an attacker has modified the size of the lin-

    ear array data structure to 10 million bytes. This array is passed to the

    PerformFunction method call when the SOAP call is executed.

    If the back-end system is mapping the supplied size of the array di-

    rectly to a memory allocation function, this type of message could

    cause the system to allocate nearly 10GB of memory, which may

    cause a Denial of Service condition.

    Another example of parameter tampering is a SQL injection attack,

    which tries to infiltrate a back-end database through the use of mis-

    placed SQL commands.

    For example:

    8123

    or 1=1 or password=

    In the previous example, the password field is overloaded with logic

    statements designed to bypass the password authentication mecha-

    nisms in a SQL database through the use of Boolean short-circuiting.

    It is important to note that most SQL injection-type attacks rely on

    software implementation bugs that fail to validate input before it is

    passed into a database query. For example, stripping quotes, white-

    space and logic operators before processing can mitigate these types

    of attacks.
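    The Boolean short-circuit described above, and one common way to avoid it, can be sketched with an in-memory SQLite database. The table layout, the stored password and both function names are assumptions made for this example. Parameter binding is shown as an alternative to stripping characters; this is a widely used technique and not one this document prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, password TEXT)")
conn.execute("INSERT INTO users VALUES (8123, 'secret')")


def login_vulnerable(user_id: int, password: str):
    # Input is pasted into the statement, so quotes in it become SQL syntax.
    query = (f"SELECT id FROM users "
             f"WHERE id = {user_id} AND password = '{password}'")
    return conn.execute(query).fetchall()


def login_parameterized(user_id: int, password: str):
    # Input is bound as data and can never change the statement's structure.
    return conn.execute(
        "SELECT id FROM users WHERE id = ? AND password = ?",
        (user_id, password),
    ).fetchall()


payload = "' or 1=1 or password='"

bypassed = login_vulnerable(8123, payload)      # the tautology matches the row
blocked = login_parameterized(8123, payload)    # payload compared as a string
```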

    3.5.2 SOAP Encapsulation

    SOAP uses a distinct header block to carry metadata about the mes-

    sage, such as security semantics for digital signatures and encryption.

    Within the header block, SOAP defines an attribute called mustUnder-

    stand which can be used to force a SOAP node to process header

    items, such as digital signatures. The semantics of mustUnderstand

    are clearly defined by SOAP v1.1 [SOAP11] as follows: If the value of

    the attribute is 1, processing the header entry is mandatory, and if it

    is 0, processing the header is optional.

    Further, the SOAP header is designed to be mutable and many inter-

    mediaries along the path may add or change the header according

    to routing and security rules. An attacker may be able to take advan-

    tage of mutable headers to force a SOAP processor to ignore a digital

    signature.

    For example:

    bglnqGibA6DiBAexBCksjX+nzhI=

    FUMuj800+69dzGmrbOuxMp7OcK4ZKy1DG6s9VWP

    12345610000

    The previous example shows a SOAP document containing a digital

    signature as specified by OASIS WS-Security [WSS-Sec]. Many of the

    elements such as the security token, timestamp, and other details are

    omitted for brevity.

    The reader should notice the fact that the mustUnderstand attribute

    is set to 1, effectively forcing the SOAP node to process the digital

    signature before passing the body to the application.

    It is important to note that in this case the header is not covered by

    the digital signature, meaning that it is possible for an attacker to add

    or remove elements from the header at will.

    For example:

    bglnqGibA6DiBAexBCksjX+nzhI=

    FUMuj800+69dzGmrbOuxMp7OcK4ZKy1DG6s9VWP

    123456

    10000

    In the previous example, the body of the SOAP request is still digitally

    signed, but the attacker has wrapped the Security element in an un-

    recognized namespace and set the processing to optional. A SOAP

    node that processes this message in a layered manner may not rec-

    ognize the signature because the semantics of the mustUnderstand

    attribute on the outer envelope instruct it to ignore the processing.

    This is another example of a boundary-layer problem caused by the

    SOAP envelope and the OASIS WS-Security framework. The fact that

    the SOAP document has a mandatory signature is lost due to this extra

    meta-information not being passed through to the outer SOAP layer.

    3.5.3 SOAP Fault Scanning

    SOAP has a built-in fault mechanism that allows a SOAP message to

    contain fault information for bad requests. This fault information is

    contained within a Fault element and may contain integer codes or

    complete text descriptions.

    In many cases, an attacker may be able to perform reconnaissance on

    a back-end system based on information returned by SOAP faults

    such as stack traces or operating system level errors. An XIP device

    that is protecting back-end systems should ensure that SOAP faults

    that may aid an attacker are not passed back through to clients.

    For example:

    HTTP/1.1 500 Internal Server Error

    Content-Type: text/xml; charset=utf-8

    Content-Length: nnnn

    -1

    System.Web.Services.Protocols.

    SoapException:

    Server was unable to process request

    The previous exception may seem benign, but the use of back-end spe-

    cific information tells an attacker that the service uses Microsoft .NET

    based on the reference to the Microsoft specific System.Web.Services

    assembly. Other examples may include Java stack traces, which tell an

    attacker that the back-end service is implemented in Java.
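    A fault-scrubbing step of the kind an XIP device might apply can be sketched as follows. This is a minimal illustration for SOAP 1.1 faults using Python's standard ElementTree; the function name, the generic replacement message and the sample fault are assumptions, and the sketch does not attempt to harden the parser itself.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)   # keep the soap prefix on output

GENERIC_MESSAGE = "Server was unable to process request"


def sanitize_fault(xml_text: str) -> str:
    """Replace fault text and drop fault detail before returning to a client."""
    root = ET.fromstring(xml_text)
    for fault in root.iter(f"{{{SOAP_NS}}}Fault"):
        faultstring = fault.find("faultstring")
        if faultstring is not None:
            faultstring.text = GENERIC_MESSAGE
        for detail in fault.findall("detail"):
            fault.remove(detail)          # stack traces usually live here
    return ET.tostring(root, encoding="unicode")


# Hypothetical back-end fault that leaks platform details.
leaky = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body><soap:Fault>"
    "<faultcode>soap:Server</faultcode>"
    "<faultstring>System.Web.Services.Protocols.SoapException: "
    "Server was unable to process request</faultstring>"
    "<detail><trace>at Foo.Bar()</trace></detail>"
    "</soap:Fault></soap:Body></soap:Envelope>"
)
clean = sanitize_fault(leaky)
```

    The original fault can still be logged on the protected side, so operators keep the diagnostic detail that clients no longer receive.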

    3.5.4 WSDL Enumeration/Guessing

    The XML Web Services paradigm utilizes an XML-based interface de-

    scription mechanism based on Web Services Description Language

    (WSDL [WSDL]). In many cases, a WSDL may undergo changes as

    methods are added and removed based on the lifecycle of the under-

    lying XML web service. Over time, the set of service interfaces may fall

    into a natural partition based on access control groups.

    For example, a certain subset of a WSDL may be communicated to exter-

    nal partners, while the complete set of method calls is only known to a

    few involved parties. Under this assumption, an attacker may be able to

    guess at other WSDL methods based on naming conventions. This type

    of attack is similar to port scanning at the network layer, but instead of

    searching for open TCP ports, an attacker searches for method calls.

    In many cases, extremely sensitive method calls should have strict

    authentication requirements, but in the early stages of XML web ser-

    vices adoption, it is likely that security for sensitive method calls will

    be implemented with a security by obscurity policy. This essentially

    means that the security of the methods is based solely on the fact

    that the names are known to a small set of parties.

    3.5.5 SOAP Replay Attack

    A SOAP replay attack is not related to cryptographic message replay.

    Instead, the idea behind this type of attack is to overwhelm a back-

    end system and cause a Denial of Service condition. An attacker may

    be able to capture valid encrypted and signed SOAP documents and

    send copies of these repeatedly to an XML processor.

    For example, consider a signed and encrypted valid SOAP request

    that utilizes OASIS WS-Security. If the payload is large enough, sim-

    ply repeating the valid request over and over again will force the XML

    processor to perform expensive XML security operations such as W3C

    XML Signature and W3C XML Encryption.

    This type of Denial of Service attack won't be detected by traditional firewalls because each request is valid in and of itself and it will generally

    take a smaller number of such encrypted and signed requests to

    overwhelm an XML processor. This implies that rate shaping or volume

    limits enforced by traditional IPS/Firewall systems may not be tripped.
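    One inexpensive countermeasure is to fingerprint each incoming message before any signature or decryption work is done and to refuse copies that repeat too often. The sketch below is an assumption about how such a guard could look; the class name ReplayGuard, its thresholds and the explicit now argument are illustrative choices, the last one made so the behavior is easy to test.

```python
import hashlib


class ReplayGuard:
    """Refuse byte-identical messages repeated too often within a time window."""

    def __init__(self, max_repeats: int = 3, window_seconds: float = 60.0):
        self.max_repeats = max_repeats
        self.window = window_seconds
        self._seen = {}               # digest -> arrival times inside the window

    def allow(self, message: bytes, now: float) -> bool:
        digest = hashlib.sha256(message).hexdigest()
        recent = [t for t in self._seen.get(digest, [])
                  if now - t < self.window]
        if len(recent) >= self.max_repeats:
            self._seen[digest] = recent
            return False              # rejected before any XML security work
        recent.append(now)
        self._seen[digest] = recent
        return True


guard = ReplayGuard(max_repeats=2, window_seconds=10.0)
request = b"<soap:Envelope>...signed and encrypted body...</soap:Envelope>"
decisions = [guard.allow(request, now=t) for t in (0.0, 1.0, 2.0, 20.0)]
```

    Hashing the raw bytes costs far less than verifying a signature or decrypting a payload, which is the asymmetry the attack relies on. Entries for digests that are never seen again would need periodic eviction in a long-running deployment; the sketch omits that.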

    3.6 Semantic Implementation Threats

    The category of semantic implementation threats refers to any type

    of content-based threat that is not clearly tied to a standard rep-

    resentation mechanism such as SOAP. The attack may be highly

    implementation- and environment-specific, relying on intimate knowl-

    edge of the back-end data model, architecture and language choice.

    Some of the keywords for most threats in this category include: XML

    Encapsulation, Code Injection, Command Injection, Buffer Overflow,

    Cross-Site Scripting.

    3.6.1 XML Encapsulation (refer to section 2.1)

    An XML encapsulation attack is an attack designed to force a back-

    end system to execute malicious code or commands. In many cases

    this is done by passing the pathological content or code within a char-

    acter data section (CDATA) within an XML document.

    For example:

    x=new ActiveXObject(WScript.Shell);

    x.Run(%systemroot%\\SYSTEM32\\CMD.EXE /C

    format C:);

    ]]>

    In many cases, simply passing arbitrary code or command line calls

    in a CDATA section will have no effect. This attack becomes signifi-

    cant when the attacker has performed significant reconnaissance on

    how the target system handles CDATA sections based upon the en-

    vironment and implementation. The pathological code may harm the

    system, tie up resources or install backdoors for a later attack.

    Other examples of semantic implementation threats are based on how

    the target is implementing SOAP requests. In some cases, if an attacker

    determines the SOAP request is parsed and fed into a script interpreter,

    simple UNIX-style command line injection may pose a threat.

    For example:

    123456 | rm * -r

    In the previous example, the SOAP layer may be processed by a scripting

    engine, such as Perl. Without validation of input, malicious

    behavior may be introduced with carefully chosen input.
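    The input validation referred to above can be as simple as an allow-list applied before the value reaches any interpreter. In the sketch below the field is assumed to be a numeric account identifier of at most twelve digits; both the field meaning and the length bound are hypothetical.

```python
import re

# Assumed format: one to twelve decimal digits and nothing else.
_ACCOUNT_ID = re.compile(r"[0-9]{1,12}")


def is_valid_account_id(value: str) -> bool:
    """Accept the value only if the whole string matches the allowed format."""
    return _ACCOUNT_ID.fullmatch(value) is not None
```

    Matching the entire string against what is allowed, rather than searching for what is forbidden, means shell metacharacters such as the pipe never need to be listed.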

    3.7 Algorithmic Threats

    Algorithmic threats are threats based on inefficient XML processing

    algorithms. In many cases, an inefficient algorithm simply causes slow

    software. In the case of XML processing however, an inefficient algo-

    rithm can be the cause of a Denial of Service attack. In some cases,

    algorithms and data structures that have average case-efficient op-

    eration can be forced into worst case behavior based on carefully

    chosen input.

    3.7.1 Hash Table Attack

    Many XML processing implementations make use of hash-table data

    structures when parsing and validating XML. For example, mapping namespace prefixes (such as soap) to the namespace name (http://

    schemas.xmlsoap.org/soap/envelope/) is often implemented with a

    hash table. For grammar validation, W3C schema validation engines

    often use hash tables to keep track of global elements and global

    schema types.

    A hash table has algorithmic performance advantages for insertion

    and retrieval. In the average case, the time required to search a hash

    table is approximately constant. In many cases, however, the hash

    function used to determine the appropriate bucket is not cryptograph-

    ically secure. If an attacker can successfully guess the hash function

    used, either by brute force or by examining open source code, the at-

    tacker may be able to mount a successful Denial of Service attack.

    The attack would work by having the attacker choose well-formed

    XML components that repeatedly hash to the same bucket, causing

    the hash-table data structure to revert to its worst case behavior. This type of attack is particularly malicious because it is able to bypass

    well-formedness checks as well as limit-enforcement checks if the attacker

    can find small enough XML components.
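    How easily colliding names can be produced for a simple, publicly known hash function is sketched below. The multiply-by-31 string hash used here is an assumption chosen because it is widely documented; this document does not state which function any particular XML processor uses.

```python
from itertools import product


def string_hash(name: str) -> int:
    """A simple polynomial string hash (multiplier 31, 32-bit wraparound)."""
    h = 0
    for ch in name:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def colliding_names(blocks: int):
    """Build 2**blocks distinct names that all share one hash value.

    "Aa" and "BB" hash identically, so any concatenation of the same
    number of these two blocks collides as well.
    """
    return ["".join(parts) for parts in product(("Aa", "BB"), repeat=blocks)]


names = colliding_names(4)
distinct_hashes = {string_hash(n) for n in names}
```

    Each name here is only eight characters long, which illustrates why per-component size limits alone do not stop the attack. Hash functions keyed with a per-process random seed are a common countermeasure.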

    3.7.2 Exponential Space Algorithms

    Some XML processors may make use of algorithms that have worst-

    case exponential behavior. This behavior may be exponential time or

    exponential space. An attacker can easily exploit such an algorithm by

    constructing well-formed, but malicious input.

    For example, consider the following XML schema fragment that refers to a global element A.

    During schema analysis time, this schema fragment must be con-

    verted into logic that is used to validate incoming XML documents.

    To accomplish this, many implementations unroll the sequence from

    the inside-out, causing the validation engine to generate code for

    each level in the tree. For this example, code will be generated that

    initially checks for the global element A, then code will be generated

    that checks for at least one occurrence of A but not more

    than two. Then, because the sequence is wrapped in yet another sequence,

    the number of combinations increases by a power of two.

    For this particular example, the number of combinations reaches 2^7,

    which is manageable, but it would not be hard for an attacker to nest

    sequences up to 100, which would require 2^100 states. This type of behavior in an algorithm implementation can easily cause a Denial of

    Service or out-of-memory condition in a back-end Web service.

    While this particular case shows a potential weakness in a schema vali-

    dation implementation, it may be argued that this code path may be

    triggered by a valid XML message (such as a SOAP document) that trig-

    gers schema validation or remote de-referencing of a poisoned schema.
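    The growth described in this section can be reproduced with a small model of naive unrolling. The model assumes, for illustration only, that each nesting level copies its inner content once per permitted occurrence; actual validation engines differ in how they expand occurrence ranges.

```python
def unrolled_size(depth: int, max_occurs: int = 2) -> int:
    """Count the copies produced when every level is duplicated max_occurs times."""
    if depth == 0:
        return 1                      # the innermost reference to element A
    return max_occurs * unrolled_size(depth - 1, max_occurs)


shallow = unrolled_size(7)            # seven nested sequences
deep = unrolled_size(100)             # one hundred nested sequences
```

    Capping the nesting depth of occurrence-constrained particles when a schema is compiled is one way to bound this growth before any instance document is validated.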

    3.8 XML Security Threats

    This section is concerned with potential security hazards that result

    from the misuse and misunderstanding of XML security standards

    such as W3C XML Signature, OASIS WS-Security, and W3C XML

    Encryption. This section does not deal with any attacks based on the

    breaking of cryptographic protocols or ciphers, but instead focuses on

    problems that arise due to the recursive nature of the XML security

    standards. In other words, this section looks at threats based on the

    boundary layer problem or lost context problem between an XML doc-

    ument, the XML Encryption layer, and the XML Signature layer.

    This category overlaps with a larger category of attacks that are more

    strongly related to applied cryptography. This section does not at-

    tempt to cover threats based on message replay, trust validation,

    man-in-the-middle attacks, password-based encryption, hash-based

    key derivation, private key protection, pseudo-random number genera-

    tion, password guessing, or PKI-based attacks.

    3.8.1 Davis Attack

    The Davis attack [DAVIS] is an attack based on context lost between
    the XML Signature and XML Encryption layers. The Davis attack has
    two variations. In the first variation, a signed and encrypted SOAP
    request is made to look like it is intended for an alternate recipient.
    In the second variation, an encrypted and signed SOAP request is made
    to appear as if it comes from an alternate sender.

    Variation #1 of the attack requires that Recipient B be a double
    agent.

    For example:

    ..

    In the previous example, assume that Sender A first signs and then

    encrypts the body of a SOAP request to send it to Recipient B.

    The order of operations is represented by the fact that the

    EncryptedKey element appears first in the Security header, meaning

    the encryption operation happened second and the signature opera-

    tion happened first.

    The important thing to note is that there is no connection between
    the signature and encryption operations. They occur in succession:
    first the signature and then the encryption. When the message is
    processed, the recipient generally will assume that the signer of the
    message is also the person who encrypted it, but nothing in the
    message makes this assumption explicit.

    Recipient B can fully decrypt the message, but doesn't have to strip
    the signature. In this case B may attempt to route the message to a
    different recipient, X. In doing this, B is attempting to convince the
    alternate Recipient X that the message was signed by A for X. In reality,
    however, the message was signed by A for B. This effect may convince
    Recipient X that the original Sender A intends a signed message for X.
    One consequence of this type of attack may be the re-routing of
    a financial transaction to an alternate recipient.

    To execute the attack, the double agent B simply needs to decrypt
    the portion intended for him or her. Once this happens, the SOAP
    request will contain only the original signature.

    For example (shown with namespaces and superfluous elements
    removed):


    ..

    At this point, Recipient B simply needs to retrieve the public key for

    Recipient X and re-encrypt the message for X. When X receives the mes-

    sage, it will appear as if the original signer intends the message for X.

    ..

    The previous example looks the same as the first. The only difference
    is that the public key used to encrypt the body of the SOAP
    document belongs to Recipient X. When Recipient B sends the message
    back into the network, Recipient X may be fooled into believing it was
    signed by Sender A explicitly for X.
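
    The re-encryption step can be illustrated with a toy model. The sketch
    below uses no real cryptography: the Signed and Encrypted classes and
    the helper functions are hypothetical stand-ins that only record who
    signed a body and for whom it was encrypted. The point is that the
    signature layer carries no information about the intended recipient,
    so it survives re-encryption unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signed:
    body: str    # the signed SOAP body content
    signer: str  # identity the signature vouches for


@dataclass(frozen=True)
class Encrypted:
    payload: Signed  # signed message wrapped by the encryption layer
    recipient: str   # owner of the public key used to encrypt


def sign_then_encrypt(body: str, sender: str, recipient: str) -> Encrypted:
    # The signature covers only the body; the recipient is not part of it.
    return Encrypted(Signed(body, sender), recipient)


def decrypt(message: Encrypted, me: str) -> Signed:
    if message.recipient != me:
        raise PermissionError("message is not encrypted for " + me)
    return message.payload


# Sender A signs and then encrypts a request for Recipient B.
original = sign_then_encrypt("transfer funds", "A", "B")

# Double agent B decrypts, keeps A's signature, and re-encrypts for X.
still_signed = decrypt(original, "B")
rerouted = Encrypted(still_signed, "X")

# X decrypts and finds a message that is validly signed by A.
seen_by_x = decrypt(rerouted, "X")
print(seen_by_x.signer, seen_by_x.body)
```

    Nothing in what X receives distinguishes this message from one that A
    really did prepare for X, which is the lost context the attack exploits.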

    3.8.2 SOAP Body Subset Signatures

    The XML Signature Recommendation and its profile by OASIS
    WS-Security allow any proper XML subdocument within an XML
    document to be the target of a signature. In the case of SOAP
    documents, it is common to specify a signature over the entire SOAP body.
    The following snippet shows an example with superfluous elements
    and namespaces removed:

    SOAP body -->

    ..

    Further, it is also possible to sign any subdocument within the SOAP
    body. If an application isn't careful, it may sign only a single element
    within the SOAP body. For example:

    SOAP body -->

    ..

    In the previous example, the element that is the target of the sig-

    nature is authenticated, but the SOAP body is not. The back-end

    application should never assume that just because a SOAP document

    contains a digital signature, the document is authenticated. In the

    previous example, it is possible for an attacker to place arbitrary XML

    content into the SOAP body without violating the digital signature.

    For example:

    SOAP body -->

    ..


    This potential threat may also be characterized as a lost context prob-

    lem. The threat is meaningful when the back-end application loses the

    meaning of the security context. In this case, it loses the fact that the

    XML Signature is only covering a portion of the SOAP body.
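
    One way an application could keep this context is to check what the
    signature actually references before trusting the body. The sketch
    below is a minimal illustration using Python's standard library. The
    sample envelope is hypothetical and heavily stripped down, and the
    function only inspects which Id values the Reference elements point
    at. It does not verify the signature itself, and a production system
    would also need a hardened parser for untrusted input.

```python
import xml.etree.ElementTree as ET

SOAP = "http://schemas.xmlsoap.org/soap/envelope/"
DS = "http://www.w3.org/2000/09/xmldsig#"
WSU = ("http://docs.oasis-open.org/wss/2004/01/"
       "oasis-200401-wss-wssecurity-utility-1.0.xsd")


def body_is_referenced(envelope_xml: str) -> bool:
    """True if some ds:Reference points at the wsu:Id of the SOAP Body."""
    root = ET.fromstring(envelope_xml)
    body = root.find(f"{{{SOAP}}}Body")
    if body is None:
        return False
    body_id = body.get(f"{{{WSU}}}Id")
    if not body_id:
        return False
    uris = {ref.get("URI") for ref in root.iter(f"{{{DS}}}Reference")}
    return "#" + body_id in uris


# Hypothetical envelope: only the <item> element is the signature target.
SUBSET_SIGNED = f"""
<s:Envelope xmlns:s="{SOAP}" xmlns:ds="{DS}" xmlns:wsu="{WSU}">
  <s:Header>
    <ds:Signature><ds:SignedInfo>
      <ds:Reference URI="#item"/>
    </ds:SignedInfo></ds:Signature>
  </s:Header>
  <s:Body wsu:Id="body">
    <item wsu:Id="item">signed content</item>
    <extra>content an attacker could add freely</extra>
  </s:Body>
</s:Envelope>
""".strip()

print(body_is_referenced(SUBSET_SIGNED))
```

    A back-end that requires this check to pass would reject the subset
    signature above, because the unsigned extra element sits outside the
    only referenced element.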

    3.8.3 SOAP Attachment Insertion Threat

    The SOAP with Attachments W3C Note [SwA] describes a mechanism to
    send a main SOAP payload with one or more attachments. Attachments
    need not be XML, but can be any registered MIME content type. SOAP
    payloads with attachments are called SOAP message packages.

    OASIS WS-Security defines a profile for attachments called the OASIS

    WS-Security SOAP with Attachments (SwA) Profile [WSS-SwA]. This

    profile describes how attachments can be encrypted, signed, or both

    using mechanisms defined by XML Signature and XML Encryption.

    Back-end Web services may rely on a SOAP payload with a set of
    authenticated attachments for processing. The threat described here
    involves an attacker inserting an attachment into the SOAP package.
    If the back-end application doesn't recognize the inserted attachment
    as a rogue attachment, it may process the attachment, believing
    erroneously that because other attachments are signed, this attachment
    is also valid. In some cases, the attachment may be a binary payload or
    Trojan designed to infiltrate the system.

    For example, consider the following SOAP message package and at-

    tachments with the attachments authenticated using the OASIS

    WS-Security SwA profile. Note that this example omits namespaces

    and many required elements for the sake of brevity:

    Content-Type: multipart/related;

    boundary=signed; type=text/xml

    --signed

    Content-Type: text/xml

    --signed

    Content-Type: image/jpeg

    Content-Id:

    Content-Transfer-Encoding: base64

    Dcg3AdGFcFs3764fddSArk

    --signed

    Content-Type: image/jpeg

    Content-Id:

    Content-Transfer-Encoding: base64

    Fdsr4532rfdwr532jdsFDSfds

    --signed

    Content-Type: image/jpeg

    Content-Id:

    Content-Transfer-Encoding: base64

    D43jfds432hjrfswf

    In the previous example, the main SOAP payload is authenticating two
    attachments at the boundaries attachment1 and attachment2. The
    attacker has inserted a third attachment that uses the same boundary
    name and naming conventions as the authenticated attachments,
    but in this case it contains a rogue payload and is not covered by the
    digital signature.

    The application must be absolutely certain that it has proper policies
    in place for handling attachment insertion. A naive application may be
    written to process all attachments first, before attempting to
    authenticate them, which may result in a security hole.

    The attachment insertion threat is another example of a potential

    lost context problem. In this case, the application has to ensure that

    it maintains security context throughout processing. That is, it must

    maintain the fact that only signed attachments are valid.
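
    A policy of this kind can be sketched as a comparison between the
    attachments present in the package and the attachments the signature
    references. The sketch assumes the signature references attachments
    with cid: URIs, as the SwA profile does. The function name and the
    third Content-Id value are hypothetical, and the check only reports
    coverage. It does not verify any digest or signature.

```python
def unsigned_attachments(content_ids, signed_reference_uris):
    """Return the Content-Id values that no signature reference covers."""
    signed = {
        uri[len("cid:"):]
        for uri in signed_reference_uris
        if uri.startswith("cid:")
    }
    # Content-Id header values are written as <id>; strip the brackets.
    return [cid for cid in content_ids if cid.strip("<>") not in signed]


# Content-Id headers found in the MIME package (third one is hypothetical).
package_ids = ["<attachment1>", "<attachment2>", "<attachment3>"]

# URIs taken from the Reference elements of the signature.
signed_uris = ["cid:attachment1", "cid:attachment2"]

rogue = unsigned_attachments(package_ids, signed_uris)
print(rogue)
```

    An application that refuses to process a package whenever this list is
    non-empty keeps the security context: only signed attachments are
    treated as valid.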

    3.9 External Entity Threats

    An external entity threat is any type of threat within an XML docu-

    ment that forces a system to de-reference a URI or open a remote

    socket, file, pipe, or communication mechanism. The attacker could be

    making an attempt to cause a Denial of Service condition, or could be

    trying to download external code for execution on the host machine.


    Due to the fact that URIs are present across all layers of an XML doc-

    ument, this is a vertical threat. That is, an external entity threat may

    occur during grammar validation, XML security processing, SOAP pro-

    cessing or custom application processing.

    Some of the keywords for most threats in this category include: XXE

    Attack, External Entity Attack, Schema Poisoning.

    3.9.1 DTD Based External Entity Attack

    This type of attack relies on the use of the SYSTEM keyword in a

    Document Type Declaration (DTD) to force an XML processor to open an

    external socket and replace XML entities with extraneous XML content.

    For example, the following entity declaration within a DTD defines an
    entity called nothing to point to an external file.

    <!ENTITY nothing SYSTEM "http://some-server.com/hugeXMLfile.xml" >

    If an attacker can manage to set the entity declaration for a particu-

    lar XML message or file, the entity can simply be referenced within the

    XML content as follows:

    There is &nothing; wrong here. &nothing;

    from &nothing; breeds &nothing;

    In the previous example, the XML content will reference the entity
    four times, possibly causing a severe Denial of Service condition if the
    external file is large enough.
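
    A gateway could refuse such documents before any entity is resolved.
    The following is a minimal sketch using the expat bindings in Python's
    standard library. The function and exception names are hypothetical,
    and rejecting every external entity outright is an assumed policy
    rather than the only reasonable one.

```python
import xml.parsers.expat


class ExternalEntityRejected(ValueError):
    """Raised when a DTD declares an entity stored outside the document."""


def reject_external_entities(xml_text: str) -> None:
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # External entities carry a system or public identifier.
        if system_id is not None or public_id is not None:
            raise ExternalEntityRejected("external entity declared: " + name)

    parser.EntityDeclHandler = on_entity_decl
    # Malformed input raises xml.parsers.expat.ExpatError instead.
    parser.Parse(xml_text, True)


POISONED = (
    '<!DOCTYPE doc ['
    '<!ENTITY nothing SYSTEM "http://some-server.com/hugeXMLfile.xml">'
    ']><doc>There is &nothing; wrong here.</doc>'
)

try:
    reject_external_entities(POISONED)
    verdict = "accepted"
except ExternalEntityRejected:
    verdict = "rejected"
print(verdict)
```

    The check fails at the declaration itself, so the remote file is never
    requested regardless of how many times the entity is referenced.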

    3.9.2 Schema Based External Entity Attack

    W3C XML Schema defines a schemaLocation attribute which specifies an
    external location from which to retrieve schema definitions. In
    some cases, it may be possible for an attacker to spoof the identity of
    the remote server or local file, effectively forcing the schema
    validation mechanism to reference a poisoned schema.

    For example, consider this fragment of the FpML schema definition.

    FpML is the Financial Products Markup Language, which is used in the

    financial services industries.

    In the previous example, this portion of the FpML schema uses two ex-

    ternal entities as defined by the include elements. In this case they refer

    to local files, but the attribute value for schemaLocation may also be a

    fully qualified URI such as http://www.fake-server.com/fpml-doc-4-0.xsd.

    If an attacker can find a way to substitute one of these files with

    an alternate one, they may be able to add extensibility points to the

    schema definition.
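
    Before a schema is handed to a validator, its include and import
    elements can be inspected for locations that would leave the local
    file system. The sketch below is one hedged way to do that. The sample
    schema is hypothetical and reuses the file name and host mentioned
    above. Treating any value with a URI scheme as remote is an assumed
    policy.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

XSD = "http://www.w3.org/2001/XMLSchema"


def remote_schema_locations(schema_xml: str) -> list:
    """List schemaLocation values on include/import that carry a URI scheme."""
    root = ET.fromstring(schema_xml)
    remote = []
    for tag in ("include", "import"):
        for node in root.iter(f"{{{XSD}}}{tag}"):
            location = node.get("schemaLocation", "")
            if urlparse(location).scheme:
                remote.append(location)
    return remote


# Hypothetical schema with one local and one remote include.
SCHEMA = f"""
<xs:schema xmlns:xs="{XSD}">
  <xs:include schemaLocation="fpml-doc-4-0.xsd"/>
  <xs:include schemaLocation="http://www.fake-server.com/fpml-doc-4-0.xsd"/>
</xs:schema>
""".strip()

print(remote_schema_locations(SCHEMA))
```

    This only catches remote references. Substitution of a local file
    still has to be prevented by protecting the files themselves.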

    3.9.3 XML Security Based External Entity Attack

    XML Security standards such as XML Signature and XML Encryption

    use URI mechanisms to reference data to be signed and encrypted. In

    many cases, the URI values utilized point to XML fragments within the

    current document. For example, SOAP documents that are signed with

    the OASIS WS-Security specification make frequent use of URI frag-

    ment identifiers which identify the signed portions based on Id values

    or the result of XPath expressions.

    In addition to local URI references, XML Signature and XML Encryption

    also fully support remote references, which may call for an external

    socket connection to retrieve some data to be signed or encrypted. An

    attacker may insert remote references into signed SOAP documents

    to cause a Denial of Service condition. The modification of the signed

    message in this case will go unnoticed because the attacker is insert-

    ing extra processing steps during the signature verification process.

    For example:


    Algorithm=http://www.w3.org/2000/09/

    xmldsig#sha1 />

    bglnqGibA6DiBAexBCksjX+nzhI=

    FUMuj800+69dzGmrbOuxMp7OcK4ZKy1DG6s9VWP

    123456

    10000

    In the previous example, the attacker has modified the Reference
    element to de-reference an external executable file. Again, because
    the Reference element is processed during signature verification, the
    change won't be detected until it is too late. In the example shown here,
    the attacker may intend a Denial of Service attack, in which case
    the target file may be huge. In other cases, the attacker may be able to
    force an external code download such as a virus, Trojan, or back door.
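
    Because the damage happens during verification, the references have to
    be inspected before verification starts. The sketch below lists every
    Reference whose URI points outside the current document. The sample
    signature is hypothetical and stripped to the relevant elements, and
    the file name in the remote URI is invented for illustration.

```python
import xml.etree.ElementTree as ET

DS = "http://www.w3.org/2000/09/xmldsig#"


def external_references(signed_xml: str) -> list:
    """Return ds:Reference URIs that point outside the current document."""
    root = ET.fromstring(signed_xml)
    external = []
    for ref in root.iter(f"{{{DS}}}Reference"):
        uri = ref.get("URI")
        # "" is the whole document and "#id" a fragment of it. A missing
        # URI is application-defined and is left to the caller.
        if uri and not uri.startswith("#"):
            external.append(uri)
    return external


# Hypothetical signature with one local and one remote reference.
SIGNATURE = f"""
<ds:Signature xmlns:ds="{DS}">
  <ds:SignedInfo>
    <ds:Reference URI="#body"/>
    <ds:Reference URI="http://www.fake-server.com/huge-file.bin"/>
  </ds:SignedInfo>
</ds:Signature>
""".strip()

print(external_references(SIGNATURE))
```

    A gateway could reject any message for which this list is non-empty,
    so that signature verification never opens an outbound connection.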

    4.0 Web 2.0 Threats and Countermeasures

    4.1 XSS (Cross Site Scripting) with XML streams

    Downstream traffic from an application to a browser can carry a
    malicious payload embedded in XML streams. These XML streams are
    parsed and used in the browser's DOM. If malicious script from such
    a stream is executed in the browser, the result is cross-site
    scripting (XSS).

    For example, Web 2.0 applications are designed to serve cross-domain
    content as part of their features. An end user can ask the application
    to show various blogs in a single place inside the application region,
    and for that the targeted Web 2.0 application will go out and fetch
    bloggers' profiles along with their recent posts. Blogs support API calls
    and return XML streams to the Web 2.0 application. These streams are
    then sent to the end client's browser. The browser processes the XML
    streams using the XHR object as shown below. This code processes a
    blogger's profile based on a parameter.

    function getXMLProfile()
    {
      var http;
      if (window.XMLHttpRequest) {
        http = new XMLHttpRequest();
      } else if (window.ActiveXObject) {
        http = new ActiveXObject("Msxml2.XMLHTTP");
        if (!http) {
          http = new ActiveXObject("Microsoft.XMLHTTP");
        }
      }
      http.open("GET", "./profile.jsp?user=jack", true);
      http.onreadystatechange = function()
      {
        if (http.readyState == 4) {
          var xmlmessage = http.responseXML;
          var profile = xmlmessage.getElementsByTagName("profile");
          var firstname =
            profile[0].getElementsByTagName("firstname")[0].firstChild.data;
          var lastname =
            profile[0].getElementsByTagName("lastname")[0].firstChild.data;
          var number =
            profile[0].getElementsByTagName("number")[0].firstChild.data;
          // Vulnerable on purpose: values taken from the XML stream are
          // written into the page without any escaping.
          document.open();
          document.write(firstname + "<br>");
          document.write(lastname + "<br>");
          document.write(number);
          document.close();
        }
      }
      http.send(null);
    }

    The code above makes an HTTP request over XHR and retrieves the
    profile for user Jack, as shown below.

    HTTP/1.1 200 OK

    Content-Length: 164

    Content-Type: text/xml


    Last-Modified: Sat, 09 Aug 2008 06:35:38 GMT

    Accept-Ranges: bytes

    ETag: e6c1fe22eaf9c81:1005

    Server: Microsoft-IIS/6.0

    X-Powered-By: ASP.NET

    Date: Mon, 04 May 2009 08:53:27 GMT

    Connection: close

    <profile>
      <firstname>Jack</firstname>
      <lastname>Smith</lastname>
      <number>212-252-5436</number>
    </profile>

    The JavaScript function processes an incoming XML stream, takes
    values from it, and passes them to document.write calls, which update
    the browser and inject new content into the current DOM. In this
    case, the function is vulnerable, and a malicious XML stream can lead
    to DOM-based XSS. If an attacker has identified this weakness,
    then as a blogger, the attacker can inject a script inside his profile
    as shown below. Once that malicious XML stream is called and loaded,
    it becomes a potential exploit point. The attacker can steal
    cookie information or run malicious commands in the browser.

    HTTP/1.1 200 OK

    Content-Length: 203

    Content-Type: text/xml

    Last-Modified: Wed, 05 Sep 2007 08:38:53 GMT

    Accept-Ranges: bytes

    ETag: 58ceb13098efc71:1005

    Server: Microsoft-IIS/6.0

    X-Powered-By: ASP.NET

    Date: Mon, 04 May 2009 08:58:26 GMT

    Connection: close

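    <profile>
      <firstname>John</firstname>
      <lastname><![CDATA[<script>alert("XSS")</script>]]></lastname>
      <number>212-675-3292</number>
    </profile>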

    Shown above is just one instance of such an XML stream. Web 2.0
    applications use various sets of XML documents as standard ways of
    sharing information. For example, RSS and Atom feeds are other popular
    data structures. It is possible to pollute RSS feeds, and once they are
    loaded in a customized RSS feed reader they can cause cross-site
    scripting. Here is an example of an XML node which is part of an RSS feed.

    ABC Corp News Feed

    Copyright 2007, ABC Media, Inc.

    en-US

    http://abc.example.com/rss

    Interesting news item

    javascript:alert(RSS - XSS)

    XYZ news

    2008-11-16T16:00:00-08:00

    In the above case, the link node is polluted with injected JavaScript.
    This link is loaded as part of an RSS feed reader, and if the XML
    stream is not sanitized, the link will cause XSS in the browser.
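
    Sanitizing a feed link can be as simple as allowing only known URL
    schemes. The sketch below is a minimal illustration. The function name
    and the choice of http and https as the only acceptable schemes are
    assumptions, and a real feed reader may need a broader or narrower
    policy.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}  # assumed policy for feed links


def is_safe_feed_link(link: str) -> bool:
    """Accept a link only if its scheme is on the allow list."""
    scheme = urlparse(link.strip()).scheme.lower()
    # Links without a scheme (relative links) are rejected as well.
    return scheme in ALLOWED_SCHEMES


print(is_safe_feed_link("http://abc.example.com/rss"))
print(is_safe_feed_link("javascript:alert('RSS - XSS')"))
```

    An allow list is used instead of a block list because an attacker has
    many ways to spell a dangerous scheme but few ways to imitate a
    permitted one.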

    Consequently, XIP needs to secure the downstream in a Web 2.0
    architecture. This is imperative in order to protect the end user's
    session. End users' browsers, rather than operating systems, are
    becoming popular endpoints for attack in the current environment. XML
    streams must be scrutinized before they are processed in the DOM. The
    fundamentals above also apply to other structures such as JSON and
    JS-Array. Similar types of vulnerabilities can be observed in RIA
    applications running over Flash or Silverlight. It is critical to
    filter both upstream and downstream XML structures and to block any
    executable payload residing in the form of JavaScript.
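
    One form such filtering could take is escaping every text value in a
    downstream profile before it reaches the browser. The sketch below
    assumes a profile element with firstname, lastname, and number
    children, matching the tag names read by the JavaScript function
    earlier in this section. The CDATA script payload in the sample is an
    assumed illustration of a polluted profile, and the function name is
    hypothetical.

```python
import html
import xml.etree.ElementTree as ET


def escaped_profile_fields(profile_xml: str) -> dict:
    """Return each child of <profile> with its text HTML-escaped."""
    profile = ET.fromstring(profile_xml)
    return {child.tag: html.escape(child.text or "") for child in profile}


# Assumed example of a polluted profile stream.
POLLUTED = (
    "<profile>"
    "<firstname>John</firstname>"
    '<lastname><![CDATA[<script>alert("XSS")</script>]]></lastname>'
    "<number>212-675-3292</number>"
    "</profile>"
)

fields = escaped_profile_fields(POLLUTED)
print(fields["lastname"])
```

    After escaping, the script arrives in the browser as inert text. The
    same values would still need context-aware handling if the page placed
    them inside attributes, URLs, or script blocks.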


    4.2 CSRF (Cross Site Request Forgery) with XML streams

    Cross Site Request Forgery (CSRF) is becoming another critical attack
    vector for client-side security. It is possible to get around the
    browser's Same Origin Policy (SOP) by using certain tags such as script,
    iframe, or form (through its action attribute). If an XML stream
    performs sensitive tasks such as transactions or password changes, then
    it is possible to craft an HTML page in such a way that HTTP requests
    are initiated without the end client's consent.

    For example, User A logs into his banking site with the right
    credentials. He transfers money to User B's account. The banking
    application runs with an XML services layer. The transfer request
    arrives in the form of XML, and since User A is authenticated, his
    cookie is replayed as well. The transfer takes place successfully.

    Here is a transfer request for User A:

    POST /login/transfer.rem HTTP/1.0

    Host: bank.org

    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1

    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

    Accept-Language: en-us,en;q=0.5

    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

    Content-Length: 210

    Content-Type: application/xml; charset=UTF-8

    Pragma: no-cache

    Cache-Control: no-cache

    Cookie: cid=213123454367439

    bank.transferACT234578901200

    In the above HTTP POST request, User A transfers US$1200 to account
    ACT23457890. All transfer information is part of the XML
    stream, and the call is of a typical XML-RPC type. The application
    makes its decision regarding the transfer based solely on the cookie
    sent as part of the request. The XML call is initiated from the browser
    using Ajax via the XHR object. The Ajax call defines the content type
    in the following HTTP directive.

    Content-Type: application/xml; charset=UTF-8

    Now it is possible to exploit this scenario. An attacker can create a
    dummy HTML page and host it on a different domain. The following page
    is hosted on attacker.org.

    document.transfer.submit();

    As soon as User A visits this page, the browser will initiate an HTTP
    POST request for transferring US$1500 to account ACT33251800. This
    request is forced without the user's consent. The HTTP request
    would look like the following:

    POST /login/transfer HTTP/1.0

    Host: www.bank.org

    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1

    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

    Accept-Language: en-us,en;q=0.5

    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

    Referer: /login/transfer.html

    Content-Type: text/plain

    Content-Length: 212

    bank.transferACT234578901500

    The key difference in the HTTP request is its content type. Because it
    is generated from an HTML form and not from the XHR object, it carries
    Content-Type: text/plain.


    Consequently, XIP needs to protect upstream XML structures hit-

    ting the services layer. One needs to validate the origin and type of

    the content. Application layer validations can be put in place by using

    CAPTCHA or unique tokens. This type of CSRF call can be initiated for

    any type of XML stream. It is possible to manipulate SOAP- or REST-

    based applications as well.
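
    Validating the content type together with a unique token can be
    sketched as a single gateway check. The header name X-CSRF-Token, the
    function name, and the way the session token is obtained are all
    assumptions made for illustration. The headers dictionary is assumed
    to use the exact key spellings shown.

```python
import hmac

XML_TYPES = {"application/xml", "text/xml"}


def is_transfer_allowed(headers: dict, session_token: str) -> bool:
    """Allow a request only if it is XML and carries the session's token."""
    media_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if media_type not in XML_TYPES:
        # A cross-domain HTML form cannot send an XML content type.
        return False
    supplied = headers.get("X-CSRF-Token", "")
    # Constant-time comparison of the supplied and expected tokens.
    return hmac.compare_digest(supplied.encode(), session_token.encode())


xhr_request = {
    "Content-Type": "application/xml; charset=UTF-8",
    "X-CSRF-Token": "t0k3n-for-this-session",
}
forged_request = {"Content-Type": "text/plain"}

print(is_transfer_allowed(xhr_request, "t0k3n-for-this-session"))
print(is_transfer_allowed(forged_request, "t0k3n-for-this-session"))
```

    The content type check alone rejects the forged form post shown in
    this section, and the token check covers cases where an attacker finds
    a way to send the expected content type.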

    4.3 XML poisoning and bombing

    Web 2.0 applications consume XML blocks coming from Ajax clients.
    It is possible to poison such an XML block, which represents an XML-RPC
    or SOAP call. A common technique is to apply recursive payloads,
    repeating similar XML nodes multiple times. At the application layer,
    a small piece of code or a library handles this incoming XML block, and
    if it processes the block poorly, the result may be a Denial of Service
    on the server.

    Many attackers also produce malformed XML documents that can disrupt
    logic, depending on the parsing mechanisms in use on the server. Two
    types of parsing mechanisms are available on the server side: SAX
    and DOM. This same attack vector is also used with Web services, since
    they consume SOAP messages, and SOAP messages are nothing but
    XML messages. Large-scale adoption of XML at the application layer
    opens up new opportunities to use this attack vector.

    In the example below, nodes are nested in such a way that SAX parsing
    will break.

    289001

    Rob2890

    01

    Rob

    Smith

    Apt #21, 1st Street

    Similarly, if the application is doing DOM-based parsing, then the
    following XML stream will put it into a large loop while it loads the
    document for processing and may cause a Denial of Service.

    289001

    Rob

    Rob

    ... 100 time

    Rob

    Smith

    Apt 31, 1st Street

    [email protected]

    3809922347

    The methods above can poison XML streams.

    Another way of poisoning or bombing XML streams is to inject
    multiple nodes into the DTD. DTD loading may end up consuming a lot of
    memory and may consequently cause an application-layer Denial of
    Service as well. Here is an example of XML bombing by expanding entities:

    CDATA #IMPLIED>

    ...

    ]>

    XIP therefore needs to add protection against these vectors before
    they hit the Web 2.0 application layer. If these vectors are injected
    into a vulnerable Web 2.0 application, they may cause harm to the
    application. Checking the size of the XML stream or its node structure
    at the XML gateway can help protect the application. A good set of
    rules at XIP can help protect against this attack vector.
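
    A node structure check of this kind can be done in a single streaming
    pass, without building a tree. The sketch below uses the expat
    bindings in Python's standard library. The limits, the function name,
    and the decision to refuse every DTD entity declaration are
    assumptions chosen for illustration, not recommended values.

```python
import xml.parsers.expat


class XmlLimitExceeded(ValueError):
    """Raised when a document breaks one of the structural limits."""


def check_xml_limits(xml_text: str, max_depth: int = 32,
                     max_elements: int = 10000) -> None:
    parser = xml.parsers.expat.ParserCreate()
    state = {"depth": 0, "elements": 0}

    def on_start(name, attrs):
        state["depth"] += 1
        state["elements"] += 1
        if state["depth"] > max_depth:
            raise XmlLimitExceeded("nesting deeper than %d" % max_depth)
        if state["elements"] > max_elements:
            raise XmlLimitExceeded("more than %d elements" % max_elements)

    def on_end(name):
        state["depth"] -= 1

    def on_entity_decl(*args):
        # Assumed policy: refuse entity declarations to stop expansion bombs.
        raise XmlLimitExceeded("entity declarations are not accepted")

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.EntityDeclHandler = on_entity_decl
    # Malformed input raises xml.parsers.expat.ExpatError instead.
    parser.Parse(xml_text, True)


def is_acceptable(xml_text: str, **limits) -> bool:
    try:
        check_xml_limits(xml_text, **limits)
        return True
    except XmlLimitExceeded:
        return False


deeply_nested = "<a>" * 50 + "</a>" * 50
repeated = "<r>" + "<firstname>Rob</firstname>" * 100 + "</r>"

print(is_acceptable(deeply_nested, max_depth=32))
print(is_acceptable(repeated, max_elements=50))
```

    Because the parser stops at the first violation, an oversized or
    deeply nested stream is rejected after reading only as much of it as
    the limits allow.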

    4.4 In transit routing and revelation

    Web 2.0 applications use Web services extensively, and the WS-Routing
    protocol allows a SOAP message to traverse complex environments. The
    paths for transmission can be defined in the header section of the
    SOAP message as per the specification. SOAP tunneling is therefore not
    point to point, but can have multiple routes. There is a way to define
    envelope traversal from start point to end point. If any of these
    intermediate targets is compromised, then there is a significant
    risk to information. Manipulating an envelope in transit is also possible.

    For example, here is a simple SOAP message with intermediate nodes

    for transmission:

    http://bluetest/test


    soap://bluerecv/end

    soap://A.com

    soap://B.com

    uuid:xxxx-xxxx-xxxxx

    ...

    As shown in the snippet above, the header information of the SOAP

    message contains both forward and reverse routes for messages. Any

    flaw in routing can cause a ma