Android Deserialization Vulnerabilities: A Brief History

In this post I will review some deserialization vulnerabilities in Android, and show how one simple query can identify most of them.

Android permission model

To illustrate the impact of deserialization vulnerabilities in Android, I will first summarize the permission model in Android:

To minimize the damage from malicious apps and malware, every Android application runs in a sandbox as a separate Linux user with very limited privileges. However, in many situations, applications need extra privileges, either to access other applications or to perform one-off actions that they do not have the privileges to do themselves. For example, an application may need to handle a file download or access the location of the phone. To make this possible, Android allows applications to communicate via IPC so that they can request other applications to carry out actions on their behalf. As a second defensive layer, a user called the system user (id 1000) is created to carry out higher-privileged actions that third-party apps cannot perform. Although powerful, the system user still has restricted privileges compared to root. Third-party apps can only interact with the system user, not with root directly. In addition, the system user mediates all communication between third-party apps and the kernel.

As with all IPC mechanisms, parameters are communicated by packaging them in some serialized format. As well as standard Java serialization, Android also provides its own serialization mechanism called Parcel. Classes that implement Parcelable need to implement the serialization method writeToParcel and provide a non-null static field called CREATOR for deserialization. Whether it is through Java serialization or through Android’s own Parcelable interface, the deserialization mechanism used in these IPCs can easily be tricked into deserializing unexpected objects. For example, when communicating via an Intent, an application may be able to put extra parameters in the Intent:

Intent intent = new Intent(Intent.ACTION_SEND);
intent.putExtra("EXPECTED", "someString"); //An expected parameter
intent.putExtra("UNEXPECTED", someObject); //Unexpected object

The receiving application here only expects a string parameter with the key EXPECTED. Within its code, it may fetch and consume this parameter:

Intent intent = getIntent();
String expected = intent.getStringExtra("EXPECTED");

As this application only receives a string parameter and the unexpected parameter is not consumed at all, it is reasonable for the developer to assume that no deserialization will take place. However, this is not true. An Intent stores these parameters as a [Bundle](https://developer.android.com/reference/android/os/Bundle.html) object, which is a Map<String, Object> under the hood. When a value in the Bundle is accessed, it will call the readMapInternal method of Parcel to deserialize its map entries:

/* package */ void readMapInternal(Map outVal, int N, ClassLoader loader) {
  while (N > 0) {
    Object key = readValue(loader);
    Object value = readValue(loader);
    outVal.put(key, value);
    N--;
  }
}

This in turn calls readValue to determine the appropriate deserialization method to use. This means that, once getStringExtra of Intent is called, all parameters will be deserialized, including the unexpected ones. In particular, this means that a malicious app can smuggle any object into an Intent and force the receiving app to deserialize it. This can be used to attack any application or process that accepts an Intent, including those owned by the system user. Historically, this has not caused too many serious issues, as very few deserialization gadgets have been found in the Android ecosystem. In addition, many of the more dangerous classes, such as TemplatesImpl and JdbcRowSetImpl, are not available in Android.
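The "read one key, deserialize everything" behaviour described above can be imitated in plain Java. The following is only a sketch under stated assumptions: LazyBundle, Payload and senderSide are hypothetical stand-ins, and standard Java serialization is used in place of Android's Parcel, so it illustrates the principle rather than the real Bundle implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation; none of these types exist in Android.
final class LazyBundleDemo {
    // Counts how often the unexpected payload has been deserialized.
    static int payloadReads = 0;

    // Stand-in for an attacker-chosen class whose deserialization has a side effect.
    static final class Payload implements Serializable {
        private static final long serialVersionUID = 1L;

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            payloadReads++; // runs while the stream is being read
        }
    }

    // Keeps the serialized bytes until the first lookup, then unpacks every entry.
    static final class LazyBundle {
        private byte[] parcelled;
        private Map<String, Object> map;

        LazyBundle(byte[] parcelled) {
            this.parcelled = parcelled;
        }

        @SuppressWarnings("unchecked")
        private void unparcel() {
            if (map != null) {
                return;
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(parcelled))) {
                map = (Map<String, Object>) in.readObject(); // all entries are read here
                parcelled = null;
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException("could not unpack extras", e);
            }
        }

        String getString(String key) {
            unparcel();
            Object value = map.get(key);
            return (value instanceof String) ? (String) value : null;
        }
    }

    // What the sending app would do: one expected and one unexpected extra.
    static byte[] senderSide() {
        Map<String, Object> extras = new HashMap<>();
        extras.put("EXPECTED", "someString");
        extras.put("UNEXPECTED", new Payload());
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(extras);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        LazyBundle bundle = new LazyBundle(senderSide());
        System.out.println("payload reads before lookup: " + payloadReads);
        System.out.println("expected value: " + bundle.getString("EXPECTED"));
        System.out.println("payload reads after lookup: " + payloadReads);
    }
}
```

The receiver only ever asks for EXPECTED, yet the counter changes after the lookup, because unpacking the map reads every entry in the stream.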

Attacking C++ from Java deserialization

As there is a fair amount of interaction between Java and C++ in Android, there are some Java objects that are simply wrappers of C++ objects. The Java wrapper object typically holds a long representation of a C++ pointer, calls C++ code to allocate the object, and then releases the memory in the finalize method.

One such class is BinderProxy. This class stores two long fields, mObject and mOrgue, which are integer representations of pointers to C++ objects. In its finalize method, a native method destroy is called to free up the memory allocated to these underlying C++ objects:

protected void finalize() throws Throwable {
  try {
    destroy();
  } finally {
    super.finalize();
  }
}

If an attacker can gain control of the pointers mObject or mOrgue, then they can cause memory corruption when the finalize method is called. This was used to demonstrate an Android deserialization vulnerability in 2014 (see below).

CVE-2014-7911: Privilege Escalation using ObjectInputStream

This vulnerability was reported by Jann Horn, who discovered that the ObjectInputStream in Android did not verify whether the class encoded in the stream was indeed serializable before deserializing it. This meant that, by modifying the stream directly, an attacker could trick Android into deserializing any class, including the BinderProxy class. Since BinderProxy is not a serializable class to start with, the members mObject and mOrgue in this class are not transient. This meant that an attacker could control the deserialized values of these pointers by putting them into the stream. Jann Horn gave a very detailed PoC in which he attacked the UserManager service, a process run by the system user, via a malicious Intent. The PoC ended up triggering a reboot of the phone. Soon afterwards, Yaron Lavi and Nadav Markus used this vulnerability to execute arbitrary code as the system user.

Finding C++ proxy classes with CodeQL

Before moving on to the next vulnerability, let’s think about what classes are dangerous in these types of attacks. What I am looking for is a Java class that owns some C++ objects and is responsible for managing their native memory. These C++ objects are typically represented as a long pointer field in the class, and the Java class will need to somehow release the memory owned by these objects. Typically, this involves the finalize method of this class calling a native method that frees the memory. This native method must then have access to the pointer fields, either directly, because they are passed to it as arguments, or indirectly, because it is an instance method that can read them from the object itself. In summary, I want to look for a Java class that:

Implements finalize; and
The finalize method calls a native method such that:
- The native method is declared in the class that implements the finalize method, or in one of its subclasses, so it has access to the pointer fields via the object itself, or
- The native method is called with the pointer fields as arguments, and these pointer fields are declared in the class that implements the finalize method or in one of its subclasses. In this case, the native method and the finalize method do not necessarily have to be declared in the same class.

With that in mind, I can now write a predicate to identify these classes:

predicate finalizeNative(FinalizeMethod finalize, Method free, RefType finalizeType) {
  free.isNative() and // The free method should be native
  finalize.polyCalls(free) and // The finalize should call free
  finalize.getDeclaringType() = finalizeType and
  (
    // Either free is declared in a subclass of the finalize class, so that it can access the pointers
    free.getDeclaringType().getASupertype*() = finalizeType or
    // Or the pointers are passed to it as arguments.
    // The pointer fields should be declared in a subclass of the finalize class.
    exists(MethodAccess ma, Field ptr | ptr.getDeclaringType().getASupertype*() = finalizeType |
      ma.getMethod() = free and
      ptr.getAnAccess() = ma.getAnArgument() and
      ptr.getType().getName() = "long" // The pointer fields should be of type long
    )
  )
}

from FinalizeMethod finalize, Method free, RefType finalizeType
where finalizeNative(finalize, free, finalizeType)
select finalizeType.getQualifiedName()

A few lines of code and I’m ready to go! Running this query, I got about 73 results. BinderProxy is among them, as well as some other classes, including FileBackupHelperBase and OpenSSLX509CRL, which could also be exploited before CVE-2014-7911 was fixed.

Of course, after CVE-2014-7911 was fixed, it was no longer possible to use arbitrary classes to carry out the attack. I now restrict the query to only look for Serializable classes. Because there are fewer such classes, I increase my chances of finding something by also looking for transitive calls from finalize to free:

predicate finalizeNative(FinalizeMethod finalize, Method free, RefType finalizeType) {
  free.isNative() and // The free method should be native
  finalize.polyCalls*(free) and // Add transitive calls using `polyCalls*`
  ....
}

from FinalizeMethod finalize, Method free, RefType finalizeType
where finalizeType.getASupertype*().hasQualifiedName("java.io", "Serializable") and // Only look for Serializable
finalizeNative(finalize, free, finalizeType)
select finalizeType.getQualifiedName()

This gives me a single result: the class OpenSSLX509Certificate. This class has a field mContext that is freed by the NativeCrypto.X509_free function in the finalize method.

protected void finalize() throws Throwable {
  try {
    if (mContext != 0) {
      NativeCrypto.X509_free(mContext);
    }
  } finally {
    super.finalize();
  }
}

CVE-2015-3825: One class to rule them all

This vulnerability was discovered by Or Peles and Roee Hay, who published a detailed paper titled ‘One class to rule them all’. That one class was the OpenSSLX509Certificate class above. As can be seen from the source code, the mContext field in the OpenSSLX509Certificate class was not transient and could therefore be set during deserialization. As with the previous vulnerability, a malicious app could cause memory corruption by sending a specially crafted serialized OpenSSLX509Certificate to another app. From there, Or Peles and Roee Hay were able to execute arbitrary code as the system user. Furthermore, they looked at dependencies of popular Android apps and discovered some other Serializable classes—generated by the code generation tool SWIG—that also contained non-transient representations of C++ pointers. The issue has since been fixed by making the field mContext transient.
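The difference that the transient modifier makes can be shown with a small plain-Java sketch. LeakyHandle and GuardedHandle are hypothetical classes invented for this illustration (they are not the OpenSSLX509Certificate source), and the round trip through a byte array stands in for an attacker-crafted stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical illustration of a native handle stored in a Serializable class.
final class TransientPointerDemo {
    // Vulnerable shape: the handle is part of the serialized form.
    static final class LeakyHandle implements Serializable {
        private static final long serialVersionUID = 1L;
        long context;

        LeakyHandle(long context) {
            this.context = context;
        }
    }

    // Fixed shape: the handle is transient, so the stream cannot set it.
    static final class GuardedHandle implements Serializable {
        private static final long serialVersionUID = 1L;
        transient long context;

        GuardedHandle(long context) {
            this.context = context;
        }
    }

    // Serializes and immediately deserializes a value.
    static Object roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    static long leakyAfterRoundTrip(long value) {
        return ((LeakyHandle) roundTrip(new LeakyHandle(value))).context;
    }

    static long guardedAfterRoundTrip(long value) {
        return ((GuardedHandle) roundTrip(new GuardedHandle(value))).context;
    }

    public static void main(String[] args) {
        long chosen = 0x41414141L; // value picked by whoever writes the stream
        System.out.println("non-transient: 0x" + Long.toHexString(leakyAfterRoundTrip(chosen)));
        System.out.println("transient:     0x" + Long.toHexString(guardedAfterRoundTrip(chosen)));
    }
}
```

In the non-transient case the value chosen by the writer of the stream survives deserialization, whereas the transient field comes back as the default 0, which a finalize method guarded by a check such as mContext != 0 would then skip.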

What about Parcelable?

So far I have not used any classes that implement Android’s own serialization mechanism, the Parcelable interface. I will now modify the query to look for both Serializable and Parcelable classes. First I abstract away these two serialization mechanisms in a QL class:

class Serial extends RefType {
  Serial() {
    this.getASourceSupertype*().hasQualifiedName("java.io", "Serializable") or
    this.getASourceSupertype*().hasQualifiedName("android.os", "Parcelable")
  }
}

Using this class, I can now modify the query to only look for classes that are either Serializable or Parcelable.

from FinalizeMethod finalize, Method free, RefType finalizeType
where finalizeType.getASupertype*() instanceof Serial and // Only look for Serializable or Parcelable
finalizeNative(finalize, free, finalizeType)
select finalizeType.getQualifiedName()

This query finds 10 results, including the class MemoryIntArray, which is Parcelable.

CVE-2017-0411 and CVE-2017-0412: Ashmem race conditions in MemoryIntArray

The constructor that instantiates a MemoryIntArray is responsible for deserializing a Parcel into a MemoryIntArray.

private MemoryIntArray(Parcel parcel) throws IOException {
  mOwnerPid = parcel.readInt();
  mClientWritable = (parcel.readInt() == 1);
  ParcelFileDescriptor pfd = parcel.readParcelable(null);
  if (pfd == null) {
    throw new IOException("No backing file descriptor");
  }
  mFd = pfd.detachFd();
  final long memoryAddress = parcel.readLong();
  if (isOwner()) {
    mMemoryAddr = memoryAddress;
  } else {
    mMemoryAddr = nativeOpen(mFd, false, mClientWritable);
  }
}

From the source code, it seems that the field mMemoryAddr represents the address of some underlying native memory. If isOwner evaluates to true, then this address is controlled by the attacker. The method isOwner checks whether mOwnerPid is the same as the current process id, and mOwnerPid is itself controlled by the attacker during deserialization:

private boolean isOwner() {
  return mOwnerPid == Process.myPid();
}

If the attacker can supply the process id of the process that it is trying to attack, then they can set the address of the memory. This is then released in the finalize method by calling close:

public void close() throws IOException {
  if (!isClosed()) {
    nativeClose(mFd, mMemoryAddr, isOwner());
    mFd = -1;
  }
}

The nativeClose method then calls munmap on the memory address:

static void android_util_MemoryIntArray_close(JNIEnv* env, jobject clazz, jint fd,
    jlong ashmemAddr, jboolean owner)
{
    ....
    int unmapResult = munmap(reinterpret_cast<void *>(ashmemAddr), ashmemSize);
    ....
}

This vulnerability was reported by Project Zero. The exploit submitted by Project Zero didn’t actually make use of this, but instead used a race condition to take control of the variable ashmemSize, which then caused out-of-bounds access.
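The ownership check discussed above trusts a process id that arrives in the same stream as the address it is supposed to protect. A minimal plain-Java sketch of that logic follows; OwnerCheckDemo, craft, addressAfterRead and LOCALLY_MAPPED are hypothetical names, a DataInputStream stands in for the Parcel, and no native memory is involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical simulation of a pid-based ownership check on serialized data.
final class OwnerCheckDemo {
    // Placeholder for the address the receiver would obtain by mapping the memory itself.
    static final long LOCALLY_MAPPED = 0x1000L;

    // What the sender writes: a claimed owner pid followed by a claimed address.
    static byte[] craft(int ownerPid, long address) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(ownerPid);
            out.writeLong(address);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // What the receiver ends up storing as its memory address.
    static long addressAfterRead(byte[] wire, int myPid) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            int ownerPid = in.readInt();       // sender-controlled
            long claimedAddress = in.readLong(); // sender-controlled
            boolean isOwner = ownerPid == myPid;
            // If the sender guessed the receiver's pid, its address is used as-is.
            return isOwner ? claimedAddress : LOCALLY_MAPPED;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int victimPid = 1234; // assumed pid of the receiving process
        long forged = 0xdeadbeefL;
        System.out.println(Long.toHexString(addressAfterRead(craft(victimPid, forged), victimPid)));
        System.out.println(Long.toHexString(addressAfterRead(craft(9999, forged), victimPid)));
    }
}
```

The check only compares two numbers, and one of them comes from the sender, so it cannot establish who actually allocated the memory.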

Reducing false positives

While the query is reasonably precise with few false positives, I noticed that it often returns results of the following form:

mPtr = nativeAllocate();

In this case, the pointer field is initialized by calling a native method to allocate the memory and so it is not really under my control. To eliminate these cases, I can restrict to the cases where the pointer field is initialized from a value read from the stream. For Java Serializable classes, I just need to make sure that the pointer is not transient or static:

/** A field controlled by deserialization. */
class ControlledField extends Field {
  ControlledField() {
    (not this.isTransient() and not this.isStatic() and
    this.getDeclaringType().getASupertype*().hasQualifiedName("java.io", "Serializable"))
  }
}

For Parcelable, I can check that there exists an assignment of the pointer field via the readLong method of Parcel. This requirement is stricter than necessary and I might miss cases where the assignment is indirect, but to keep things simple, I will use this approximation as an example.

/** A method access that reads a long value from a `Parcel`. */
class ParcelAssignment extends MethodAccess {
  ParcelAssignment() {
    exists(Method m | this.getMethod() = m and
      m.hasName("readLong") and
      m.getDeclaringType().hasQualifiedName("android.os", "Parcel")
    )
  }
}

/** A field controlled by deserialization. */
class ControlledField extends Field {
  ControlledField() {
    // For Serializable, I want the field to be non transient and non static
    (not this.isTransient() and not this.isStatic() and
    this.getDeclaringType().getASupertype*().hasQualifiedName("java.io", "Serializable")) or
    // For Parcelable, I want it to be assigned via readLong
    exists(ParcelAssignment pa | this.getAnAssignedValue() = pa)
  }
}

predicate finalizeNative(FinalizeMethod finalize, Method free, RefType finalizeType) {
  free.isNative() and
  finalize.polyCalls(free) and
  finalize.getDeclaringType() = finalizeType and
  (
    free.getDeclaringType().getASupertype*() = finalizeType or
    // Now the pointer field is a ControlledField
    exists(MethodAccess ma, ControlledField ptr | ptr.getDeclaringType().getASupertype*() = finalizeType |
      ma.getMethod() = free and
      ptr.getAnAccess() = ma.getAnArgument() and
      ptr.getType().getName() = "long" // The pointer fields should be of type long
    )
  )
}

This query now only finds cases where I have control of the pointer field. Running it on the latest snapshot of AOSP, I find no results.

What about Java-only gadgets?

So far these exploits involve an attacker using Java deserialization to cause memory corruption via C++ code. What about deserialization gadgets that use Java classes only? As I explained before, there are currently no known Java-only gadgets that exist in the Android SDK, and many of the more dangerous classes are not implemented. A simple case to check is whether there are any deserialization methods that call arbitrary class constructors directly, for example:

private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
  in.defaultReadObject();
  transientField = (FieldType)(Class.forName(nonTransientString).newInstance());
  // nonTransientString gets populated after deserialization.
}

The code pattern I am looking for here is:

The Class.forName method is called within a deserialization method.
The argument of Class.forName can be controlled. At a minimum, this means it should not be a constant.

I start with a QL class that models deserialization methods in Android. As well as the readObject method used by Serializable, there is also a createFromParcel method used by Parcelable, which is declared in the Creator interface.

class ParcelableDeserializeMethod extends Method {
  ParcelableDeserializeMethod() {
    // Parcelable deserialize method.
    (this.getName() = "createFromParcel" and this.getDeclaringType().getASourceSupertype*().hasName("Creator")) or
    // Serializable readObject
    (this.getName() = "readObject" and this.isPrivate() and
     this.getDeclaringType().getASourceSupertype*().hasQualifiedName("java.io", "Serializable")
    )
  }
}

I can now write a query to look for classes that call Class.forName in their deserialization method:

class ForName extends Method {
  ForName() {
    this.hasName("forName") and this.isStatic() and
    this.getDeclaringType().getSourceDeclaration().hasQualifiedName("java.lang", "Class")
  }
}

from MethodAccess ma
// Access to Class.forName
where ma.getMethod() instanceof ForName and
// The class type is not a constant
not ma.getArgument(0) instanceof CompileTimeConstantExpr and
// Called by a deserialization method.
exists(ParcelableDeserializeMethod m | m.polyCalls(ma.getEnclosingCallable()))
select ma, ma.getLocation()

This query gives me three results: two are from the URL class where I don’t have much control over the class type, but the third one is the ParcelableException class:

public static Throwable readFromParcel(Parcel in) {
  final String name = in.readString();
  final String msg = in.readString();
  try {
    final Class<?> clazz = Class.forName(name, true, Parcelable.class.getClassLoader());
    return (Throwable) clazz.getConstructor(String.class).newInstance(msg);
    ...
}

From the source code, it is clear that both the class type and the argument of the constructor are in the attacker’s control, which means that the deserialization of ParcelableException allows an attacker to call an arbitrary constructor with an arbitrary string argument.

CVE-2017-0871: Not so arbitrary Java code execution

As the serialization code in ParcelableException does enforce type checking on the Class that is written into the Parcel, an attacker still needs to modify the Parcel directly to exploit this vulnerability. However, this is not difficult and can be done in a similar way to Jann Horn’s PoC. In my PoC of this vulnerability, I call the constructor of FileOutputStream to truncate an arbitrary file as the system user.

When Kevin Backhouse and I discovered this vulnerability in November, we reported it to the Android Security Team, but were later told that the vulnerability had already been reported by another researcher on August 30. At the time, the patch wasn’t yet merged into the public repository of AOSP. Eventually, the vulnerability was fixed with the December security update and the source code was updated with a check put in place:

final Class<?> clazz = Class.forName(name, true, Parcelable.class.getClassLoader());
if (Throwable.class.isAssignableFrom(clazz)) {
  return (Throwable) clazz.getConstructor(String.class).newInstance(msg);
}
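The effect of this check can be tried out in plain Java. The sketch below is not the AOSP code: the method names unchecked and checked are hypothetical, Class.forName is called without the Parcelable class loader, and the RuntimeException fallback for rejected classes is an assumption made for the illustration.

```java
// Hypothetical comparison of reflective instantiation with and without a type check.
final class GuardedInstantiationDemo {
    // Pre-patch shape: any class with a public (String) constructor can be instantiated.
    static Object unchecked(String name, String msg) {
        try {
            Class<?> clazz = Class.forName(name);
            return clazz.getConstructor(String.class).newInstance(msg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + name, e);
        }
    }

    // Post-patch shape: only subclasses of Throwable are instantiated.
    static Throwable checked(String name, String msg) {
        try {
            Class<?> clazz = Class.forName(name);
            if (Throwable.class.isAssignableFrom(clazz)) {
                return (Throwable) clazz.getConstructor(String.class).newInstance(msg);
            }
        } catch (ReflectiveOperationException e) {
            // Fall through to the generic exception below.
        }
        // Assumed fallback for this sketch: wrap the name and message instead.
        return new RuntimeException(name + ": " + msg);
    }

    public static void main(String[] args) {
        // A harmless class stands in for something like FileOutputStream.
        System.out.println(unchecked("java.lang.StringBuilder", "x").getClass());
        System.out.println(checked("java.lang.StringBuilder", "x").getClass());
        System.out.println(checked("java.lang.IllegalArgumentException", "boom").getClass());
    }
}
```

With the check in place, the name of a non-Throwable class no longer reaches getConstructor, so its constructor is never run, while legitimate exception types are still reconstructed as before.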

Conclusion

In this post I reviewed a number of deserialization vulnerabilities in Android and showed how the flaw in Android’s IPC mechanism led to four different exploitable vulnerabilities. In particular, I showed how to use a simple query to find Java classes that manage native memory and I identified three of these vulnerabilities. I then showed how to use CodeQL to discover a fourth vulnerability that allows an arbitrary Java constructor to be called.
