CVE-2018-11776: How to find 5 RCEs in Apache Struts with CodeQL

In April 2018, I reported a new remote code execution vulnerability in Apache Struts and to the Struts Security Team. The vulnerability has been assigned CVE-2018-11776 (S2-057), is exposed on servers running Struts under certain configurations and can be triggered by visiting a specially crafted URL.

This discovery is part of my continuing security research into Apache Struts. In this post I’ll walk through the process that led me to find the vulnerability. I’ll explain how I used information on previous vulnerabilities to gain knowledge of the internal working of Struts and to create queries that encapsulate Struts-specific concepts. Running these queries produced results that highlighted the problematic code. These queries are hosted on GitHub, and over time, as research continues, we’ll be adding more queries and libraries to this repository to aid security research for Struts and other projects.

Mapping the attack surface

Many security vulnerabilities involve data flowing from an untrusted source—for example, user input—to some specific location (a sink) where the data gets used in a way that is dangerous—for example, an SQL query, deserialization, some other interpreted language, etc… CodeQL makes it easy to search for this type of vulnerability. You simply need to describe the various sources and sinks, and then let the DataFlow library do all the hard work. A good way to start an investigation into this kind of problem, for a particular project, is to look at known vulnerabilities for older versions of the software. This can give you good insight into the sorts of sources and sinks you want to look for.

For this investigation, I started by looking at the RCE vulnerabilities S2-032 (CVE-2016-3081), S2-033 (CVE-2016-3687) and S2-037 (CVE-2016-4438). Like many other RCEs in Struts, these involve untrusted input being evaluated as an OGNL expression, which allows the attacker to run arbitrary code on the server. These three vulnerabilities are particularly interesting, not only because they give us some insights into the internal working of Struts, but also because what is effectively the same issue took three attempts to fix!

All three issues were the result of remote inputs being passed via the variable methodName as an argument to the method OgnlUtil::getValue().

String methodName = proxy.getMethod();    //<--- untrusted source, but where from?
LOG.debug("Executing action method = {}", methodName);
String timerKey = "invokeAction: " + proxy.getActionName();
try {
    UtilTimerStack.push(timerKey);
    Object methodResult;
    try {
        methodResult = ognlUtil.getValue(methodName + "()", getStack().getContext(), action); //<--- RCE

The field proxy has the type ActionProxy, which is an interface. Looking at its definition, I see that in addition to the method getMethod() (which is used in the code above in the assignment of the tainted variable methodName) there are also various methods like getActionName() and getNamespace(). These methods looked like they may return information from the URL. So I began with the assumption that all of these methods could potentially return untrusted input. (In later posts, I will dig deeper into my investigation of where these inputs come from.)

Now we can start modeling these untrusted sources using CodeQL’s language, named QL:

class ActionProxyGetMethod extends Method {
  ActionProxyGetMethod() {
    getDeclaringType().getASupertype*().hasQualifiedName("com.opensymphony.xwork2", "ActionProxy") and
    (
      hasName("getMethod") or
      hasName("getNamespace") or
      hasName("getActionName")
    )
  }
}

predicate isActionProxySource(DataFlow::Node source) {
   source.asExpr().(MethodAccess).getMethod() instanceof ActionProxyGetMethod
}

Identifying the OGNL sinks

Now that we’ve identified and described some untrusted sources, the next step is to do the same for the sinks. As mentioned before, many Struts RCEs involve remote input being parsed as OGNL expressions. There are many functions in Struts that ultimately evaluate their arguments as OGNL expressions under the hood; in the case of the three vulnerabilities we started with in this post, OgnlUtil::getValue() was used, however in the vulnerability S2-045 (CVE-2017-5638), TextParseUtil::translateVariables() was used. Instead of describing each of these methods as separate sinks, we can look for common functions that they use to execute OGNL expressions. I decided that OgnlUtil::compileAndExecute() and OgnlUtl::compileAndExecuteMethod() look like promising sinks.

I described them both in a single QL predicate like so:

predicate isOgnlSink(DataFlow::Node sink) {
  exists(MethodAccess ma | ma.getMethod().hasName("compileAndExecute") or ma.getMethod().hasName("compileAndExecuteMethod") |
    ma.getMethod().getDeclaringType().getName().matches("OgnlUtil") and
    sink.asExpr() = ma.getArgument(0)
  )
}

First attempt at taint tracking

Now that we have our sources and sinks defined with CodeQL, we can use these definitions in a taint-tracking query. We use the DataFlow library to do this, by defining a DataFlow Configuration:

class OgnlTaintTrackingCfg extends DataFlow::Configuration {
  OgnlTaintTrackingCfg() {
    this = "mapping"
  }

  override predicate isSource(DataFlow::Node source) {
    isActionProxySource(source)
  }

  override predicate isSink(DataFlow::Node sink) {
    isOgnlSink(sink)
  }

  override predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
    TaintTracking::localTaintStep(node1, node2) or
    exists(Field f, RefType t | node1.asExpr() = f.getAnAssignedValue() and node2.asExpr() = f.getAnAccess() and
      node1.asExpr().getEnclosingCallable().getDeclaringType() = t and
      node2.asExpr().getEnclosingCallable().getDeclaringType() = t
    )
  }
}

from OgnlTaintTrackingCfg cfg, DataFlow::Node source, DataFlow::Node sink
where cfg.hasFlow(source, sink)
select source, sink

Here I use the previously defined isActionProxySource and isOgnlSink predicates.

Note that I also override a predicate called isAdditionalFlowStep. This predicate allows me to include extra steps where tainted data can be propagated. This allows me, for example, to incorporate project-specific information in the flow configuration. For example, if I had components that communicated via some network layer, I could describe with CodeQL what the code of those various network endpoints looked like, allowing the DataFlow library to track data being tainted through that abstraction.

For this particular query, I added two extra flow steps for the DataFlow library to use. The first one:

TaintTracking::localTaintStep(node1, node2)

includes the standard CodeQL TaintTracking library steps that track through standard Java library calls, string operations etc. The second addition is an approximation that allows me to track taint data via field accesses:

exists(Field f, RefType t | node1.asExpr() = f.getAnAssignedValue() and node2.asExpr() = f.getAnAccess() and
  node1.asExpr().getEnclosingCallable().getDeclaringType() = t and
  node2.asExpr().getEnclosingCallable().getDeclaringType() = t
)

This says that if a field is assigned to some tainted value, then an access to that field will also be considered tainted, as long as both expressions are called by methods of the same type. Roughly speaking, this incorporates the following case:

public void foo(String taint) {
  this.field = taint;
}

public void bar() {
  String x = this.field; //x is tainted because field is assigned to tainted value in `foo`
}

As you can see, the access of this.field in bar() may not always be tainted. For example if foo() is not called before bar(). Because of this, we don’t include this flow step in the default DataFlow::Configuration, as we cannot guarantee that data will always flow in this manner. For hunting vulnerabilities however, I find this addition to be useful and often include them in my DataFlow::Configuration. In later posts, I’ll share some other flow steps that are like this one, and which are useful for bug hunting but are not included by default for similar reasons.

Once I ran the query on the latest version of the source code, and started looking through the results, I noticed that the causes of the issues S2-032, S2-033 and S2-037 were still being flagged by the query. Before looking into the other results it found, I wanted to investigate why these particular ones were still being flagged even though the code was fixed.

It turns our that while the first vulnerabilities were initially fixed by sanitizing the inputs, after S2-037 the Struts team decided to fix it by replacing a call to OgnlUtil::getValue() with a call to OgnlUtil::callMethod().

methodResult = ognlUtil.callMethod(methodName + "()", getStack().getContext(), action);

The method callMethod() wraps a call to compileAndExecuteMethod():

public Object callMethod(final String name, final Map<String, Object> context, final Object root) throws OgnlException {
  return compileAndExecuteMethod(name, context, new OgnlTask<Object>() {
    public Object execute(Object tree) throws OgnlException {
      return Ognl.getValue(tree, context, root);
    }
  });
}

And compileAndExecuteMethod() performs an additional check for the expression before executing it:

private <T> Object compileAndExecuteMethod(String expression, Map<String, Object> context, OgnlTask<T> task) throws OgnlException {
  Object tree;
  if (enableExpressionCache) {
    tree = expressions.get(expression);
    if (tree == null) {
      tree = Ognl.parseExpression(expression);
      checkSimpleMethod(tree, context); //<--- Additional check.
    }

That means we can actually remove compileAndExecuteMethod() from our sink.

You can see that after I re-ran the query, the result that highlighted the call to getMethod() as a sink disappeared. However, there were still a couple of results that highlighted code in DefaultActionInvocation.java that was supposedly fixed, such as this call to getActionName(), and it wasn’t immediately obvious what the data path was from here to a compileAndExecute() sink.

To investigate why this result was being flagged, I needed to be able to see each of the individual flow steps that the DataFlow library used to produce this result. CodeQL allows you to write special path-problem queries that produce variable length paths that can be explored node by node, and the DataFlow library allows you to write queries that output this data.

At the time of writing this blog post, LGTM itself does not have a path exploration UI for path-problem queries, so I needed to use CodeQL for Eclipse. This is an Eclipse plugin that includes a visualization tool exactly for this purpose, allowing you to go through the individual steps in the taint tracking. It not only allows offline analysis of open source projects on GitHub, but also gives you a much more powerful development environment. The following queries can be found in the Semmle/SecurityQueries Git repository under the semmle-security-java directory. You can follow the instructions in the README.md files to run them in the Eclipse plugin. From here on I’ll include screenshots from CodeQL for Eclipse.

[EDIT]: Path exploration is now available on LGTM. You can also use our free CodeQL extension for Visual Studio Code. See installation instructions at https://securitylab.github.com/tools/codeql/.

First, run the query in initial.ql. In CodeQL for Eclipse, upon selecting the result from DefaultActionInvocation.java, you can see a detailed path from source to sink in the Path Explorer window.

chainedTo

In the image above, you can see that after a few steps, the value returned by the call to getActionName() flows into an argument in a call to get() on the object returned by pkg.getActionConfigs():

String chainedTo = actionName + nameSeparator + resultCode; //actionName comes from `getActionName` somewhere
ActionConfig chainedToConfig = pkg.getActionConfigs().get(chainedTo); //chainedTo contains `actionName` and ended up in the `get` method.

Clicking on the next step, key, takes me to the method ValueStackShadowMap::get(), which looks like:

public Object get(Object key) {
  Object value = super.get(key);  //<--- key gets tainted?

  if ((value == null) && key instanceof String) {
    value = valueStack.findValue((String) key);  //<--- findValue ended up evaluating `key`
  }

  return value;
}

It turns out that because pkg.getActionConfigs() returns a Map, and ValueStackShadowMap implements the Map interface, it is, in theory, possible that the value returned by pkg.getActionConfigs() is an instance of ValueStackShadowMap. As a result, the CodeQL DataFlow library shows this potential flow from the variable chainedTo to the implementation of get() in the class ValueStackShadowMap. In practice, the class ValueStackShadowMap belongs to the jasperreports plugin and instances of that class are only created in a couple of places, none of which are returned by pkg.getActionConfigs(). After looking at the issue and convincing myself that ValueStackShadowMap::get() is unlikely to be hit, I removed results that rely on it by adding a barrier in the DataFlow::Configuration:

override predicate isBarrier(DataFlow::Node node) {
  exists(Method m | (m.hasName("get") or m.hasName("containsKey")) and
    m.getDeclaringType().hasName("ValueStackShadowMap") and
    node.getEnclosingCallable() = m
  )
}

This predicate says that if the taint data flows into the get() or containsKey() methods of ValueStackShadowMap, then do not carry on tracking it. (I added the containsKey() method here as it suffers from the same problem.)

After adding a further barrier for ActionMapping::toString() (which was causing problems whenever toString() was called on arbitrary objects), I re-ran the query which left us with only a handful of results. You can also try the one for the Eclipse plugin to visualize the taint path.

The new vulnerabilities

With only 10 pairs of sources and sinks, it’s easy enough to go through by hand and check to see if these are genuine issues. Going through some of the paths, I saw that some are invalid as they were inside test cases etc., so I added a few more barriers in the query to filter out these paths too. This left some really interesting results.

Take the source in ServletActionRedirectResult.java for example:

redirect1

In the first step, the source from the call to getNamespace() flows into the argument of the constructor of ActionMapping via the variable namespace:

public void execute(ActionInvocation invocation) throws Exception {
  actionName = conditionalParse(actionName, invocation);
  if (namespace == null) {
    namespace = invocation.getProxy().getNamespace();  //<--- source
  } else {
    namespace = conditionalParse(namespace, invocation);
  }
  if (method == null) {
    method = "";
  } else {
    method = conditionalParse(method, invocation);
  }

  String tmpLocation = actionMapper.getUriFromActionMapping(new ActionMapping(actionName, namespace, method, null)); //<--- namespace goes into constructor of ActionMapping

  setLocation(tmpLocation);

Following the steps further, I saw that getUriFromActionMapping() returns a URL string that uses namespace from the constructed ActionMapping. This then flows into the argument of setLocation() via the variable tmpLocation:

setLocation() then set the field location in the super class StrutsResultSupport:

public void setLocation(String location) {
    this.location = location;
}

The code then calls execute() on ServletActionResult:

String tmpLocation = actionMapper.getUriFromActionMapping(new ActionMapping(actionName, namespace, method, null));

setLocation(tmpLocation);

super.execute(invocation);

which passes the location field to a call to conditionalParse():

public void execute(ActionInvocation invocation) throws Exception {
    lastFinalLocation = conditionalParse(location, invocation);
    doExecute(lastFinalLocation, invocation);
}

conditionalParse() then passes location into translateVariables(), which evaluates param as an OGNL expression under the hood:

protected String conditionalParse(String param, ActionInvocation invocation) {
    if (parse && param != null && invocation != null) {
        return TextParseUtil.translateVariables(
            param,
            invocation.getStack(),
            new EncodingParsedValueEvaluator());
    } else {
        return param;
    }
}

So it looks like when the namespace parameter is not set in a ServletActionRedirectResult, the code takes the namespace from the ActionProxy, which then gets evaluated as an OGNL expression. To test this, I replaced the struts tag in one of the configuration files (struts-actionchaining.xml for example) in the showcase application by the following:

<struts>
    <package name="actionchaining" extends="struts-default">
        <action name="actionChain1" class="org.apache.struts2.showcase.actionchaining.ActionChain1">
           <result type="redirectAction">
             <param name = "actionName">register2</param>
           </result>
        </action>
    </package>
</struts>

I then ran the showcase application locally, and then visited a URL that was designed to trigger this vulnerability and execute a shell command to open the calculator application on my computer.

And it worked (after spending some time to bypass the OGNL sandbox). At this stage, I am holding off on giving further details, but I’ll release them in due course.

Not only that, but the untrusted sources from ActionChainResult, PostbackResult and ServletUrlRenderer worked as well! The one in PortletActionRedirectResult probably works as well, but I didn’t test it. Four RCEs is probably enough to demonstrate the severity of the issue.

Conclusions

In this post I’ve shown that by using known (past) vulnerabilities to help build a taint model of an application, new vulnerabilities can be found by just leaving the hard work to the CodeQL DataFlow library. In particular, by studying three previous RCEs in Struts, we ended up finding four (possibly five) more!

Given that S2-032, S2-033 and S2-037 were all discovered and fixed within a short time-frame, security researchers clearly studied S2-032 to look for similar issues and discovered S2-033 and S2-037. So the big question here is: given that the vulnerability I found here (S2-057) also comes from a similar tainted source, how was this missed by both the security researchers and the vendor and only discovered two years later? In my opinion, it’s because the similarities between S2-032, S2-033 and S2-037 are, in a sense, local, in that they all occur in a similar location in the source code (all in the Rest plugin). The similarity between S2-057 and S2-032 is at a much more semantic level. They are linked by the tainted source, rather than the location of the source code, so any software or tool that would successfully find variants like this would need to be able to perform this kind of semantic analysis across the whole code base, which as I demonstrate can now be done with CodeQL.

And if you think all this is more of a fluke or a Columbus’ egg, because I made the assumption that the namespace field in an ActionProxy is tainted out of the blue, then stay tuned for the next post, where I’ll go into more detail and do some taint tracking from “first principles,” starting from the incoming HttpRequestServlet itself. I’ll also share some tools from my “bug hunting toolbox,” and some general tips for refining queries. In doing so, we’ll also see how CodeQL could have been used to catch S2-045, the vulnerability that Equifax suffered from!