Spring Data REST exploit (CVE-2017-8046): Finding a RCE vulnerability with CodeQL

In this post I take a look at a remote code execution vulnerability in Spring Data REST and I show how CodeQL can help making sure that a patch is properly implemented.

Details of the vulnerability

This vulnerability (also known as CVE-2017-8046) enables an attacker to send a PATCH request with maliciously crafted JSON data to run arbitrary code on the server, and is unfortunately very easy to exploit. It affects the Spring Data REST component, which is distributed as part of various other Spring projects, including the Spring Boot framework.

I discovered this vulnerability back in September 2017. Pivotal responded very swiftly to the report and, with our help, released a patch for affected components in October 2017. At Pivotal’s request and due to the critical nature of this vulnerability, we postponed the publication of this vulnerability until March 2018. A full timeline of the disclosure can be found at the bottom of this post.

Spring Expression Language (SpEL) vulnerabilities

The Spring expression language (SpEL) is a powerful expression language that is used in the spring framework. Typically, a string gets parsed into an Expression using a SpelExpressionParser, which then gets executed when one of the methods getValue, getValueType or setValue is called:

Expression expr = expressionParser.parseExpression(stringScript);
expr.getValue(); //Execute the code in stringScript

Often, the use of SpEL is confined to internal usage and stringScript is in complete control of the programmer. However, as SpEL is a very powerful language, if the input to parseExpression comes from an untrusted source, then the attacker will be able to execute arbitrary code on the vulnerable application. For example, if stringScript in the above is as follows:

String stringScript = "T(java.lang.Runtime).getRuntime().exec(" + cmd + ").x";

then cmd will be executed.

This is what happened in the case of Spring Data REST, where a remote user input ended parsed into a SpEL expression and got executed. To find the vulnerability using CodeQL, I first model the classes and method accesses that are used in the expression parser:

/** The class `ExpressionParser` in spring. */
class ExpressionParser extends RefType {
  ExpressionParser() {
    this.hasQualifiedName("org.springframework.expression", "ExpressionParser")
  }
}

/** Methods to parse an expression. */
class ParseExpression extends MethodAccess {
  ParseExpression() {
    exists (Method m |
      (m.getName().matches("parse%") or m.hasName("doParseExpression"))
      and
      this.getMethod() = m
    )
  }
}

This will identify the calls to the various methods that begin with parse. As some of these methods are declared in ExpressionParser while some others are declared in SpelExpressionParser only, I won’t specify the declaring type of the methods here, but rather restrict the declaring type in the results of the query.

To make our results more precise, I wanted to look for cases where these expressions were parsed in a context where the input was hostile. One approach is to track the remote inputs using the dataflow library as we did before. However, as I am not too familiar with the internal workings of Spring Data REST to model the remote input method, I decided to do some exploratory work and look for hints in the Javadoc. Often in classes that takes remote input in a web application, a path or URL will be specified in the Javadoc or in an annotation. In the QL language, I can access the Javadoc with the Javadoc class, so let’s look for Javadoc that contains a path or a URL. The following class looks at the Javadoc of a class and performs some basic string analysis to check whether it contains an expression that looks like a file path or a URL.

/** Look for hints in the javadoc that indicates this method is use for processing path. */
class DocHasPath extends Javadoc {
  DocHasPath() {
    exists(string s |
      //Look for something that indicates the functionality is to do with a path.
      s = this.toString().regexpMatch("(.* )?/[^ ]*/.*|.*path.*")
    )
  }
}

I then used the above to find methods declared in classes whose Javadoc may contain a path.

/** Look for methods that may be used for processing paths, indicated by javadoc. */
class CallHasPath extends Callable {
  CallHasPath() {
    //exclude test classes
    not this.getDeclaringType() instanceof TestClass and
    (
      this.getDoc().getJavadoc() instanceof DocHasPath or
      this.getDeclaringType().getDoc().getJavadoc() instanceof DocHasPath
    )
  }
}

In particular, I was interested in methods that are used for parsing strings into SpEL expressions, so putting these together, I found two results. These results appear in the same class, and the Javadoc suggests that the class is used to convert patch paths to SpEL expressions. Further investigation shows that the path variable being parsed in a class calls PatchOperation. Subclasses of PatchOperation are used to process PATCH requests. These classes receive an input from a PATCH request, which they then store as a field member path and parse into a SpEl Expression.

After testing that this is exploitable, I reported the issue to Pivotal, and the team responded quickly by adding a verifyPath method in PatchOperation to sanitize the input. The idea is that before the expression is evaluated, verifyPath is called to make sure that the remote input is properly sanitized:

verifyPath(entityType); //verify the `path` field that was parsed into a SpEL expression (`spelExpression`)
return evaluate(spelExpression.getValueType(targetObject)); //OK to call `getValueType` for evaluation.

Using CodeQL to detect incomplete fixes

It is not uncommon to see related security issues due to the original problem not being fixed completely. For example, in Spring-Webflow, there are two similar vulnerabilities CVE-2017-4971 and CVE-2017-8039, which are the same issue exploited through different code paths. Incomplete fixes of security issues can lead to serious problems as the security announcement would have given hackers a hint of where the issue is, so it is important to make sure that a patch covers all different variances of the vulnerability before making an announcement. To see whether the fix is complete, let’s check whether all code paths that call getValue, getValueType or setValue are called after verifyPath is called. First let me define a class that models the verifyPath method:

/** The sanitization method `verifyPath` */
class VerifyPath extends Method {
  VerifyPath() {
    this.hasName("verifyPath") and
    this.getDeclaringType().hasQualifiedName("org.springframework.data.rest.webmvc.json.patch", "PatchOperation")
  }
}

To determine whether a SpEL expression is always evaluated after verifyPath is called, I made use of the ControlFlow CodeQL library. First I define an ActionConfiguration, which models the calls to verifyPath. I use the isAction predicate to indicate that I want to look at execution paths that go through a call to verifyPath.

/** A control flow node that represents a call to `verifyPath`. */
class VerifyPathActionConf extends ActionConfiguration {
  VerifyPathActionConf() { this = "VerifyPathActionConf" }
  override predicate isAction(ControlFlowNode node) {
    node.(MethodAccess).getMethod() instanceof VerifyPath
  }
}

ActionConfiguration contains a predicate named callAlwaysPerformsAction, which checks whether its input (a Call) leads to the specified action (in this case, a call to verifyPath) being performed on every code path.

from MethodAccess ma, VerifyPathActionConf conf
where conf.callAlwaysPerformsAction(ma)
select ma

This identifies functions that always call verifyPath, for example:

public void callsVerify() {
  verifyPath();
}

and leaves out the ones that may not call verifyPath on some execution path, for example:

public void notAlwaysCallsVerify(boolean call) {
  if (call) {
    verifyPath();
  }
}

Using VerifyPathActionConf, I can now look for calls that evaluate an SpEL expression without first calling verifyPath. More specifically, I want to find out which PATCH requests are still vulnerable. As PATCH requests are carried out via the perform function of the PatchOperation class, I want to flag up an operation that may still be vulnerable. For example

public void perform(Object target, Class<T> type) {
  spelExpression.getValue()
}

which evaluates spelExpression directly without first calling verifyPath. I also need to identify cases where spelExpression is evaluated transitively, e.g.

public void perform(Object target, Class<T> type) {
  evaluateWithoutVerify();  //Evaluates spelExpression without first calling verifyPath
}

while leaving out the ones that are safe.

public void perform(Object target, Class<T> type) {
  verifyPath(type);
  evaluateWithoutVerify();
}

or

public void perform(Object target, Class<T> type) {
  evaluateWithVerify(); //calls verifyPath before evaluation.
}

Unlike local control flow analysis which struggles with all these possibilities and their transitive versions, CodeQL handles these cases with ease. Summarizing the above cases, an unsafe evaluation call should satisfy these conditions:

If it calls an evaluation method directly without first calling verifyPath
If it calls another unsafe evaluation call without first calling verifyPath

We see that an unsafe evaluation call is defined recursively. So to construct a QL class that models it, we use recursion:

/** A method that evaluates the expression before calling `verifyPath`.*/
class UnsafeEvaluateCall extends MethodAccess {
  UnsafeEvaluateCall() {
    (
      //Base of the recursion.
      this.getMethod() instanceof Evaluate
      or
      //recursive definition: This calls another `UnsafeEvaulateCall`
      exists(UnsafeEvaluateCall unsafe |
        this.getMethod() = unsafe.getEnclosingCallable()
      )
    )
    and
    //Does not always call verify before this.
    not exists(VerifyPathCallerAccess verify |
      dominates(verify, this)
    )
  }
}

Using this in a query, I found that the copy, move and remove operations are still vulnerable. After testing that the exploit still works with these operations, I reported the issue to Pivotal. Pivotal again responded quickly to make sure that all operations are protected by verifyPath. Extra unit tests were added to check the fix.

Query, don’t guess

While this seems to have fixed the issue completely with verifyPath applied to all operations, I decided to check whether verifyPath itself was implemented correctly. While checking the exact implementation of a sanitization algorithm is still best left to human, as we have seen, CodeQL can be extremely useful in identifying any logical error in code. In particular, the programmer and I seem to have assumed that path in PatchOperation was the only variable that gets parsed into a spelExpression, so I wanted to know if this really is the case. I wrote another query with DataFlow that looks for any expression that flows into an argument of a parse method:

from FlowSource source, Expr sink
where source.flowsTo(sink)
  and exists(ParseExpression m | sink = m.getAnArgument() and 
    m.getQualifier().getType().(RefType).getASupertype*() instanceof ExpressionParser and
    m.getEnclosingCallable().getDeclaringType() instanceof PathToSpEL)
  //Exclude the field `path` that have been checked by `verifyPath` already.
  and source.getEnclosingCallable().getDeclaringType().getASupertype*() instanceof PatchOperation
  and not exists(PathField path, FlowSource commonSource |
                 commonSource.flowsTo(path.getAnAccess()) and commonSource.flowsTo(source)
                 )
select source, sink

This query gave me 3 results. So it looks like that in the copy and move operations, another variable called from is also parsed into a SpEL expression. This makes sense as we expect copy and move to take two paths, one representing the source and the other, the destination. However, when we just look at PatchOperation, it is easy to think that all operations take only one path and miss out other cases, especially when a codebase is extended over time with new operations added by different people. With CodeQL, instead of guessing, I can perform a thorough search and get an answer within a couple of minutes. After testing that the above operations are still exploitable, I reported my findings to Pivotal again. This time they were confident that their new unit tests had caught all the issues and it took slightly longer for them to respond. Looking at the tests, it is clear that the from variable was not tested. Even if it was the case, the exploit will now require path to be a legitimate path, which is also not very easy to catch via either unit tests or standard fuzzing.

In the end, the entire patch expression handling code was rewritten. This time, all SpEL expressions are parsed inside a new class called SpelPath with a subclass TypedSpelPath that calls verifyPath during construction. The natural question is whether any evaluation method is called outside of TypedSpelPath. To do so, I wrote a simple query that looks for functions that call the evaluation methods.

from Method m
where exists(Method e | e instanceof Evaluate and m.polyCalls(e))
select m, m.getDeclaringType()

Running the query, I discovered that all evaluation calls were indeed within TypedSpelPath, so it looks like the issue is finally fixed.

Disclosure timeline and vendor response

7 Sept 2017: Initial private disclosure to Pivotal security team
21 Sept 2017: Announcement made by Pivotal
22 Sept 2017: Verified patch by Pivotal, reported it was incomplete (with updated exploit)
26 Sept 2017: Response from Pivotal about a new fix
27 Sept 2017: Verified patch by Pivotal, reported it was incomplete (with updated exploit)
25 Oct 2017: Final fix committed
1 Mar 2018: Announcement in two blog posts on our blog. Due to the critical nature of the vulnerability, at Pivotal’s request, we postponed the publication of this vulnerability until March 2018.

Conclusion

In this post we have seen how CodeQL can help identify similar vulnerabilities within a codebase and to ensure that a vulnerability fix is complete. This type of variant analysis has been hugely successful with many customers, and I hope it will be useful to you too. As a further example, another query would identify both CVE-2017-4971 and CVE-2017-8039 in Spring-Webflow. These vulnerabilities, that I mentioned earlier in this article, are also caused by unsafe parsing of SpEL expressions.

Note: Post originally published on LGTM.com on March 01, 2018