In-Memory Data Grid Applications: Finding Common Java Deserialization Vulnerabilities with CodeQL

This post is about Java deserialization vulnerabilities that I found in some major in-memory data grid applications. Although these are not recent, they demonstrate how the built-in queries on LGTM can help to identify vulnerabilities.

Data grid applications use the RAM of a cluster to store data. They are often used as a distributed data cache by conventional database systems, to improve performance and scalability. As you can imagine, these applications often make heavy use of serialization to transfer data.

In this post, I look at the following open source applications: Apache Geode, Red Hat Infinispan, Apache Ignite, and Hazelcast. These applications all suffered from very similar types of deserialization vulnerabilities.

The two main issues are:

Deserialization of incoming requests to join the cluster.
Deserialization of data stored in the data grid.

Although none of these applications ship with dependencies that contain known deserialization gadgets, they are quite often deployed alongside other database frameworks whose dependencies do contain such gadgets. Many will depend on commons-beanutils, for example, which can be chained with JDK classes to form gadgets. (For more information, see Moritz Bechler’s paper: Java Unmarshaller Security.)

I use Apache Geode (the open source version of Pivotal’s Gemfire) as an example to explain these issues in detail.

Apache Geode unsafe deserialization of cluster join request (CVE-2017-15692)

Data grid applications often operate in cluster mode: they listen out for requests sent from a new node and decide whether to allow the new node to join the cluster. The problem with all these applications is that different nodes in the cluster communicate through serialized Java objects. If an attacker can somehow send a packet to the cluster, the cluster first deserializes the packet, and then decides whether it is a legitimate request. These clusters usually operate on a secure local network, so while this type of attack is theoretically remote and pre-auth, in practice, it is likely to be local.

Let’s take a closer look at this issue in the case of Apache Geode. With recent improvements to the CodeQL data flow library, the standard “Deserialization of user-controlled data” query on LGTM now reports alerts for these issues. Note that, because the fix for these issues is to allow users to define a whitelist and blacklist to filter out classes for deserialization, they are still flagged up by the standard LGTM query.

The second of these alerts has 10 separate entry points for user data. I wrote a proof-of-concept exploit for the fifth entry point, where the server processes a message sent to the cluster in TcpServer.java. In this case, the packet is received from a network socket. This then goes into the third argument of the processOneConnection method:

try {
  input = new DataInputStream(socket.getInputStream()); //<--- Untrusted source
} catch (StreamCorruptedException e) {
  ...
}
...
if (firstByte == CommunicationMode.ReservedForGossip.getModeNumber()) {
  processOneConnection(socket, startTime, input);    //<--- Next step

This argument then ends up in the DataSerializer.readObject method:

private void processOneConnection(Socket socket, long startTime, DataInputStream input)
  throws IOException, UnsupportedVersionException, ClassNotFoundException {
  ...
  input = new VersionedDataInputStream(input, Version.fromOrdinal(versionOrdinal, false));
  request = DataSerializer.readObject(input); //<--- Deserialize under the hood.

As the name suggests, the readObject method performs Java deserialization with input under the hood.
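The underlying pattern (deserialize first, validate afterwards) can be countered by restricting which classes may be instantiated while the stream is being read. The following is a minimal sketch of that idea using the JDK's ObjectInputFilter (Java 9 or later). It is not Geode's code or its actual fix: the class name FilteredDeserialization, the helper methods, and the allow pattern are all hypothetical and chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

final class FilteredDeserialization {
    // Hypothetical policy: accept classes in java.lang, reject everything else.
    static final String ALLOW_PATTERN = "java.lang.*;!*";

    // Serialize an object to bytes, standing in for a packet received from the network.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Deserialize with the filter installed before the first object is read.
    static Object readFiltered(byte[] data) throws IOException, ClassNotFoundException {
        ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(ALLOW_PATTERN);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(filter);
            return in.readObject();
        }
    }

    // Returns true if the filter refused to deserialize the given object.
    static boolean isRejected(Object obj) throws IOException, ClassNotFoundException {
        try {
            readFiltered(serialize(obj));
            return false;
        } catch (InvalidClassException e) {
            // The stream reports a rejected class as InvalidClassException.
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readFiltered(serialize("join request")));
        System.out.println("ArrayList rejected: " + isRejected(new ArrayList<String>()));
    }
}
```

The filter is consulted as each class descriptor is resolved, so a disallowed class is refused before any of its readObject logic runs. A real deployment would need an allow-list that covers every message type the cluster legitimately exchanges.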

Apache Geode unsafe deserialization of application objects (CVE-2017-15693)

As I mentioned in the introduction, data grid applications use the RAM of the cluster as a data cache to reduce latency. This means that the Java objects stored in the cache either need to be transported from one machine to another, or need to be stored in some more persistent storage. How is this done? In Geode, data is fetched and written via the Region interface:

Region<String, String> region;
ClientCache cache = null;
// connect to the locator using default port 10334
cache = new ClientCacheFactory(props).setPoolSubscriptionEnabled(true)
      .addPoolLocator("localhost", 10334).create();
region = cache.<String, String>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
      .create(REGION1);
try {
  region.get(AUTHOR_LYNCH);
} finally {
  cache.close();
}

Let’s take a look at what the get method of Region calls. It is easy to find these cases with CodeQL:

import java

from Method m, Method n
where
  // get method of a cache region, taking a single argument
  m.hasName("get") and
  m.getNumberOfParameters() = 1 and
  m.getDeclaringType().getASourceSupertype*().getName().matches("Region%") and
  // Java deserialization method
  n.hasName("readObject") and
  n.getDeclaringType().hasQualifiedName("java.io", "ObjectInputStream") and
  // m calls n, directly or transitively
  m.polyCalls*(n)
select m, m.getDeclaringType(), n

As the results of the above query indicate, Java deserialization appears to be used to transport these objects. Let’s suppose that there is a malicious actor with write access to a region in the data grid. This does not necessarily require a very high privilege level, as the region may be a designated space for them to store their own data, and they may not be allowed to read or write to any other region. However, they can hide a malicious object in the data grid in the following way:

Region<String, Object> region;
ClientCache cache = null;
// connect to the locator using default port 10334
cache = new ClientCacheFactory(props).setPoolSubscriptionEnabled(true)
      .addPoolLocator("localhost", 10334).create();
region = cache.<String, Object>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
      .create(REGION1);
try {
  region.put(AUTHOR_LYNCH, obj); //<--- their favorite gadget goes here
} finally {
  cache.close();
}

What happens if a user with more privileges then comes along and fetches the object?

Region<String, String> region;
ClientCache cache = null;
// connect to the locator using default port 10334
cache = new ClientCacheFactory(props).setPoolSubscriptionEnabled(true)
      .addPoolLocator("localhost", 10334).create();
region = cache.<String, String>createClientRegionFactory(ClientRegionShortcut.CACHING_PROXY)
      .create(REGION1);
try {
  region.get(AUTHOR_LYNCH); //<--- deserialize?
} finally {
  cache.close();
}

It looks like the get method in Region doesn’t actually check the type before deserializing.

[Image: calculator]
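The reason a typed read offers no protection is that, with plain Java serialization, the cast to the expected type only happens after the object has been fully reconstructed. The sketch below illustrates this with the JDK alone; it does not use Geode's API. The class CastAfterDeserialize, its Payload class, and the fetchAsString helper are hypothetical, and the "side effect" is just a flag standing in for whatever a real gadget would do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicBoolean;

final class CastAfterDeserialize {
    // Set by Payload.readObject to show that code ran during deserialization.
    static final AtomicBoolean sideEffectRan = new AtomicBoolean(false);

    // Harmless stand-in for a gadget: its readObject hook runs while the stream is read.
    static final class Payload implements Serializable {
        private static final long serialVersionUID = 1L;

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            sideEffectRan.set(true);
        }
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // The caller expects a String, but the cast is evaluated only after readObject returns.
    static String fetchAsString(byte[] stored) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stored))) {
            return (String) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] stored = serialize(new Payload());
        try {
            fetchAsString(stored);
        } catch (ClassCastException e) {
            System.out.println("cast failed; side effect ran: " + sideEffectRan.get());
        }
    }
}
```

By the time the ClassCastException is thrown, the payload's readObject hook has already executed, which is why declaring the region as Region<String, String> on the reading side does not help.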

Disclosure timeline:

Oct 12, 2017: Private disclosure to vendor.
Feb 23, 2018: Public disclosure by vendor.

Infinispan unsafe deserialization of data grid object (CVE-2017-15089)

Red Hat’s Infinispan also suffers from the cluster joining issue, where an attacker can send a malicious packet to a cluster to be deserialized. Red Hat’s response to this is that, as most deployments of Infinispan are in a secure network, this issue is only exploitable when an attacker has access to this network. As communications between cluster nodes can also be configured to use encryption with a simple configuration file, they do not consider this an issue. I’d advise users of Infinispan (JBoss Data Grid) to follow the instructions in the link to encrypt communications between cluster nodes.

The issue with unsafe deserialization of data grid objects is an interesting one. It appears that the developers are aware of the issue and have attempted to add whitelisting validation when Java deserialization is performed. Unfortunately, Infinispan also uses the JBoss marshalling library for deserialization. What is even more unfortunate is that, when storing and fetching data grid objects, JBoss marshalling is used as the default serialization mechanism, with Java deserialization as a fallback:

public static <T> T bytes2obj(Marshaller marshaller, byte[] bytes, short status, List<String> whitelist) {
  if (bytes == null || bytes.length == 0) return null;
  try {
    Object ret = marshaller.objectFromByteBuffer(bytes);  //<--- uses JBoss marshalling under the hood
    if (HotRodConstants.hasCompatibility(status)) {
      if (ret instanceof byte[] && isJavaSerialized((byte[]) ret)) {
        T ois = tryJavaDeserialize(bytes, (byte[]) ret, whitelist); //<--- Java deserialization with whitelist

The documentation of JBoss marshalling says it is fully compatible with JDK serialization and I’ve checked that all the deserialization gadgets in ysoserial are indeed ‘compatible’ with JBoss marshalling. So again, a malicious actor can smuggle some objects into the data grid:

ConfigurationBuilder builder = new ConfigurationBuilder();
builder.addServer().host("127.0.0.1").port(ConfigurationProperties.DEFAULT_HOTROD_PORT);
RemoteCacheManager cacheManager = new RemoteCacheManager(builder.build());
RemoteCache<String, Object> cache = cacheManager.getCache();
cache.put("key", obj); //obj = favorite gadget

I tested this and then reported it to Red Hat.
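The shape of the problem in bytes2obj can be reproduced with the JDK alone: if an unrestricted decoder runs first and the whitelist is applied only on a fallback path, the whitelist never sees objects that the first decoder accepts. The sketch below is an assumption-laden illustration, not Infinispan's code: plain Java serialization stands in for JBoss marshalling, and the class FallbackOnlyFilter, its methods, and the allow pattern are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

final class FallbackOnlyFilter {
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Stand-in for the primary marshaller: no class restrictions at all.
    static Object unfilteredDecode(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    // Fallback path: Java deserialization restricted by a filter pattern (Java 9+).
    static Object filteredDecode(byte[] bytes, String pattern) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.setObjectInputFilter(ObjectInputFilter.Config.createFilter(pattern));
            return in.readObject();
        }
    }

    // Mirrors the control flow described above: the filter applies only when
    // the primary decoder happens to yield a nested byte array.
    static Object decode(byte[] bytes, String pattern) throws IOException, ClassNotFoundException {
        Object ret = unfilteredDecode(bytes); // any class is instantiated here
        if (ret instanceof byte[]) {
            return filteredDecode((byte[]) ret, pattern);
        }
        return ret;
    }

    public static void main(String[] args) throws Exception {
        String pattern = "java.lang.*;!*"; // hypothetical whitelist
        Object smuggled = decode(serialize(new ArrayList<String>()), pattern);
        System.out.println("decoded despite whitelist: " + smuggled.getClass().getName());
    }
}
```

In this sketch an ArrayList sent directly is decoded even though the whitelist forbids it, while the same ArrayList wrapped in a serialized byte array is rejected on the fallback path. For a whitelist to be effective, it has to cover every decoder that can instantiate attacker-supplied classes.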

Disclosure timeline:

Oct 16, 2017: Private disclosure to vendor.
Feb 12, 2018: Public disclosure by vendor.

Hazelcast

Hazelcast also suffers from both of these vulnerabilities. When I got round to auditing Hazelcast, I discovered that someone had already opened a public GitHub issue about the cluster joining vulnerability in April 2016.

I also sent the details and my proof-of-concept exploit (for both of the vulnerabilities) to the vendor sometime in December 2017, but only received an initial reply. The issue was fixed in May 2018. The fix should apply to both the cluster joining vulnerability and object fetching vulnerability.

Apache Ignite

Apache Ignite also suffers from these vulnerabilities. They are identified by the default query, and were fixed by adding class filters to the corresponding ObjectInputStream. They were assigned CVE-2018-1295 and CVE-2018-8018 respectively.

Disclosure timeline (CVE-2018-1295):

Oct 5, 2017: Private disclosure to vendor.
Apr 2, 2018: Public disclosure by vendor.

Disclosure timeline (CVE-2018-8018):

May 22, 2018: Private disclosure to vendor.
Jul 19, 2018: Public disclosure by vendor.

Conclusions

In this post I examined four popular data grid applications and found that they all suffered from unsafe deserialization issues. This is a great reminder of why users should make sure that they keep these applications up-to-date.