January 15, 2020

Review of Chromium IPC vulnerabilities

Man Yue Mo

In this post we will delve into the interesting world of Chromium IPC research. I will discuss some of my own findings as well as recent findings from the wider security research community, in an effort to uncover recurring vulnerability themes in the Chrome IPC attack surface. Thanks to Ned Williamson’s recent work on fuzzing IPC interfaces directly with libprotobuf-mutator [1], as well as Mark Brand’s research [2] on how to call and subsequently fuzz these interfaces directly from JavaScript, the Chromium IPC attack surface has proven to be a source of many vulnerabilities. My research uncovered six new bugs: CVE-2019-13688 (995964), CVE-2019-5876 (997190), CVE-2019-13700 (998431), CVE-2019-13687 (998548), CVE-2019-13699 (1001503), and CVE-2019-13695 (1004730).

All of these issues were disclosed and fixed across the Chromium releases of September and October 2019 (M77, M78). All are rated High severity and, if successfully exploited, would allow a compromised renderer to escape the Chromium sandbox.

As Chromium security issues get derestricted after 14 weeks and details of the bugs are written up in the tickets, I will not go through the individual bug details in this post. Instead, I'll provide a high-level overview of my bug hunting strategy and explain how I used CodeQL to help find these bugs.

Readers who are interested in the bug details can refer to the tickets themselves. All of these issues, with the exception of 1001503 (which was found by ‘accident’ when I was experimenting with IPC calls), were found by manual code review with the help of CodeQL.

I'll first briefly explain Chromium’s sandbox model and the impact of these IPC issues. After that, I'll take a look at several vulnerabilities of this type and classify them into different groups. This will then help me to focus my effort and discover new vulnerabilities.

Chromium multi-process architecture

The Chromium multi-process architecture is well documented, so I'll be brief here. The Chromium browser runs as a set of processes, each of which has different privileges and is responsible for different tasks. This architecture not only makes the browser more stable (for example, a renderer crash will not affect the browser or any other renderer), but also allows for a more granular privilege model, because OS-level sandboxes can be applied to individual processes. Processes communicate with each other via IPC messages, and low-privilege processes can ask high-privilege processes to perform certain tasks by sending them IPC messages.

From a security point of view, there are two main process contexts: the renderer process pool and the browser process. The renderer process pool is a set of low-privilege processes where V8, Blink, etc. run. In general, the renderer processes have the lowest privilege of all Chromium processes and are heavily sandboxed. A remote code execution vulnerability in a renderer process will, in general, need to be chained with another vulnerability in the browser process in order to escape the sandbox. There are exceptions, though. For example, on Android, the Android binder process and some Android services can be accessed from inside the renderer sandbox, and vulnerabilities in these processes can be used to escape the sandbox as well; see, for example, here [3], here [4], and here [5]. Another notable exception is a vulnerability disclosed in March 2019 that allowed the renderer to directly exploit a bug in the win32k.sys kernel driver on Windows 7.

For the purpose of this post, I'll focus on vulnerabilities that are triggered via the IPC channels.

Chromium IPC interfaces

There are two main channels for IPC between processes in Chrome. One is the Mojo interface, which is newer and more common. There is also an older IPC interface, which is less common but still in use.

Mojo Interface

Details of the Mojo interface can be found here. In this section I'll focus more on how the interface is represented in the code and where to look for it, rather than repeating the documentation.

The Mojo interfaces are defined in .mojom files in the Chromium source code. When building Chromium, these files are used to generate the C++ source code and the JavaScript bindings. You can find these generated files in the src/out/<target>/gen/ directory of the Chromium checkout after you build it. Mark Brand of Project Zero has a convenient script that he used to extract the JavaScript bindings in his bug tickets, which I shamelessly stole and used in my bug reports as well =P.

Implementations of these Mojo interfaces in the browser process are mainly found under the content/browser directory, although there are some exceptions. For example, the PaymentRequest interface is implemented in /src/components/payments.

To access these interfaces in JavaScript, the Mojo.bindInterface method can be used. For example, in this Project Zero ticket filed by Mark Brand, the PaymentRequest interface is used as follows:

var payment_request = new payments.mojom.PaymentRequestPtr();
Mojo.bindInterface(payments.mojom.PaymentRequest.name,
                   mojo.makeRequest(payment_request).handle);

The first line creates a PaymentRequestPtr, which serves as a renderer-side proxy that can be bound to the PaymentRequest implementation in the browser process. The second statement uses Mojo.bindInterface to bind payment_request to the payments.mojom.PaymentRequest interface running in the browser process. After that, the methods defined by this interface can be called from JavaScript. For example,

payment_request.init(
  payment_request_client,
  [],
  payment_details,
  payment_options);

will invoke the Init method in PaymentRequest.

Another way to test or fuzz the IPC interfaces is to interact with them directly by writing small, unit-test-like fuzz harnesses. For example, the AppCacheFuzzer written by Ned Williamson uses libprotobuf-mutator to create IPC messages that fuzz the component directly. This is generally done by creating a fuzz harness for the targeted component, setting up the correct environment in the fuzzer, and then supplying a .proto file that defines the messages used for fuzzing the interface. For example, in the case of AppCache, the Command message defines the different methods exposed in the Mojo interface:

// Based on blink::mojom::AppCacheBackend and blink::mojom::AppCacheHost
// interfaces.
// See third_party/blink/public/mojom/appcache/appcache.mojom
message Command {
  oneof command {
    RegisterHost register_host = 1; //<-- AppCacheBackend::RegisterHost
    ...
  }
}

For each command, the exact way to use it is defined in another message with the same name:

message RegisterHost {
  required HostId host_id = 1; //<-- Takes an argument of type HostId
}

With this specified, libprotobuf-mutator will be able to generate calls with the correct argument types to fuzz the component.
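To make the mapping from a generated message to an interface call more concrete, here is a minimal, self-contained sketch of the dispatch step of such a harness. It is not the AppCacheFuzzer and does not use libprotobuf-mutator: std::variant (C++17) stands in for the protobuf oneof, and FakeBackend, UnregisterHost and RunCommands are hypothetical names made up for illustration. Only RegisterHost and its host ID argument come from the snippets above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <variant>
#include <vector>

// Stand-ins for the messages in the .proto file. UnregisterHost is a
// hypothetical second command, added only so there is something to dispatch on.
struct RegisterHost { int32_t host_id; };
struct UnregisterHost { int32_t host_id; };

// std::variant plays the role of the protobuf `oneof command`.
using Command = std::variant<RegisterHost, UnregisterHost>;

// Hypothetical target with the same shape as a Mojo interface implementation.
class FakeBackend {
 public:
  void RegisterHostImpl(int32_t host_id) { hosts_.insert(host_id); }
  void UnregisterHostImpl(int32_t host_id) { hosts_.erase(host_id); }
  std::size_t host_count() const { return hosts_.size(); }

 private:
  std::set<int32_t> hosts_;
};

// Replays a command sequence against a fresh target and returns the number of
// hosts still registered. Every command is well typed, so each one reaches the
// handler logic instead of being rejected as malformed.
std::size_t RunCommands(const std::vector<Command>& commands) {
  FakeBackend backend;
  for (const Command& command : commands) {
    if (const auto* reg = std::get_if<RegisterHost>(&command)) {
      backend.RegisterHostImpl(reg->host_id);
    } else if (const auto* unreg = std::get_if<UnregisterHost>(&command)) {
      backend.UnregisterHostImpl(unreg->host_id);
    }
  }
  // Sanity check: there can never be more hosts than commands.
  assert(backend.host_count() <= commands.size());
  return backend.host_count();
}
```

In a real harness the command sequence would come from the mutator rather than being written by hand, and the target would be the actual browser-side implementation set up inside the fuzzer.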

Old IPC interface

While the old IPC interfaces are less common these days and many of them are being migrated to the Mojo interface, they are still in use and one of my bugs is in fact reached via the old interface.

In general, a class that uses this interface will define an OnMessageReceived method, inside of which different messages are mapped to different handlers:

bool MyClass::OnMessageReceived(const IPC::Message& message) {
  ...
  IPC_MESSAGE_HANDLER(ViewHostMsg_MyMessage, OnMyMessage); //<-- OnMyMessage is the handler
  ...
}

The OnMyMessage handler behaves like a method in the Mojo interface, which handles an IPC message from another process.

To send a message to another process from the renderer, the Send method can be used:

Send(new ViewHostMsg_MyMessage());

I do not know of any JavaScript interface for these, so I usually patch the renderer to send messages when I want to test these interfaces.

Analysis of previous bugs

In this section I'll take a look at some vulnerabilities reported in the first half of 2019. I'll focus on simpler issues that are caused by interactions between raw pointers and unique pointers, as well as issues with callbacks. There are some other, fairly complex issues due to interactions between shared pointers and raw pointers, which Ned Williamson discovered in 2018, but they are beyond the scope of this article.

I will review the following vulnerabilities:

Mark Brand (Project Zero): P0_1730, P0_1735, P0_1743, P0_1755, P0_1754, P0_1767, P0_1803

Brendon Tiszka: 977462

Guang Gong (Qihoo 360 Alpha): 956597

Gengming Liu, Jianyu Chen, Zhen Feng, Jessica Liu (Tencent Keen Security Lab): 941746

On the face of it, many of these issues are object lifetime management issues: an object referenced through a raw pointer gets freed by a (compromised) renderer, and the browser then attempts to use the dangling pointer. This categorization, however, is far too general and does not really help us with looking for vulnerabilities. So in the end, I decided to classify the issues into smaller and more focused categories. While this isn’t exactly root cause analysis, as my classifications aren’t really the cause of the issues, this approach helped me to understand the issues better and to find bugs.

Category 1: Non-trivial raw pointer field management

This is a very broad category. On its own it really isn't a bug, let alone a vulnerability. However, due to the often complex situations that can arise in use-after-free scenarios, I don't have any satisfactory way to further characterize these types of vulnerabilities, so this category ended up being my kitchen sink for raw pointer field issues.

To clarify what I mean when I say non-trivial vs trivial raw pointer field management, here are some examples of what I consider to be trivial, or straightforward, raw pointer field management scenarios:

  1. Raw pointer points to owner:

This is very common. Often raw pointer fields are in fact pointing to the owner of the object, in which case it is very unlikely to encounter object lifetime issues, for example:

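class A;

// B is defined first so that it is a complete type when A constructs it.
class B {
  ...
  B(A* a) {
    a_ = a;
  }

  A* a_;  // raw pointer back to the owner
};

class A {
  ...
  A() {
    b_ = std::make_unique<B>(this);
  }
  std::unique_ptr<B> b_;
};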

If all instances of B are constructed in the constructor of A, then a_ will most likely point to the owner of the object unless it gets reassigned, which is very rare.

  2. Destructor of the pointed-to class clears the raw pointer:

This is also very common. Normally the class that holds a raw pointer "observes" the lifetime of the pointed-to object, and when that object gets deleted, the raw pointer is cleared as well. For example:

class A : public BObserver {
  ...
  A(B* b) {
    b_ = b;
    b_->AddObserver(this);
  }

  ~A() {
    if (b_) {
      b_->RemoveObserver(this);
    }
  }

  // Called from B's destructor, so b_ never outlives the object it points to.
  void OnBDestroyed(B* b) {
    if (b == b_) {
      b_ = nullptr;
    }
  }
  B* b_;
};

class B {
  ...
  ~B() {
    for (BObserver* observer : observers_) {
      observer->OnBDestroyed(this);
    }
  }
};

One example of this kind of protection method is the FrameServiceBase, which has a raw pointer to RenderFrameHost but observes its lifetime to prevent use-after-free.

Raw pointer fields that are not trivially protected are often managed by more complicated logic but are mostly ok. Every now and again, however, there may be flaws in the clean up logic which can lead to vulnerabilities.

One example is P0_1735. Here, the class PaymentSheetViewController contains two raw pointer fields, spec_ and state_. The objects they point to are owned by the PaymentRequest object, which cannot be deleted without also deleting the PaymentSheetViewController. This, however, does not guarantee that the fields spec and state in PaymentRequest cannot be reset. In fact, these fields can be reset in the PaymentRequest::Init method, which can be called directly from the Mojo interface. Thus a compromised renderer can easily free the objects backing spec_ and state_, causing a use-after-free.
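The shape of this bug can be reproduced in a few lines. The following is a minimal sketch under my own assumptions, not the Chromium code: Spec, ViewController, Request and SpecsFreedUnderController are hypothetical stand-ins, and the controller never dereferences its pointer, so the example stays free of undefined behavior while still showing that the pointed-to object is gone.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the spec object. It bumps a counter when destroyed
// so the example can observe the free without touching freed memory.
struct Spec {
  explicit Spec(int* destroyed) : destroyed_(destroyed) {}
  ~Spec() { ++*destroyed_; }
  int* destroyed_;
};

// Holds a non-owning pointer, in the role of PaymentSheetViewController.
class ViewController {
 public:
  explicit ViewController(Spec* spec) : spec_(spec) {}
  // One possible remedy in this sketch; the vulnerable pattern never calls it.
  void Rebind(Spec* spec) { spec_ = spec; }

 private:
  Spec* spec_;  // raw pointer, never dereferenced in this sketch
};

// Owns both the Spec and the controller, in the role of PaymentRequest.
class Request {
 public:
  explicit Request(int* destroyed) : destroyed_(destroyed) {}

  // Assumed to be callable any number of times by the renderer.
  void Init() {
    // Assigning a new Spec frees the previous one...
    spec_ = std::make_unique<Spec>(destroyed_);
    // ...but the controller, created once, keeps the pointer it was given.
    if (!controller_)
      controller_ = std::make_unique<ViewController>(spec_.get());
  }

 private:
  int* destroyed_;
  std::unique_ptr<Spec> spec_;
  std::unique_ptr<ViewController> controller_;
};

// Returns how many Specs were freed while the controller was still alive.
int SpecsFreedUnderController(int init_calls) {
  assert(init_calls >= 1);
  int destroyed = 0;
  int freed_while_alive = 0;
  {
    Request request(&destroyed);
    for (int i = 0; i < init_calls; ++i) request.Init();
    freed_while_alive = destroyed;  // sampled before request is torn down
  }
  return freed_while_alive;
}
```

A single call to Init frees nothing, but every further call frees the Spec that the controller was constructed with. Ownership looks sound when you only ask "can the owner be deleted?", and the problem only appears once you ask "can the owned field be reassigned?".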

Other known issues in this category include P0_1754 and P0_1767 (both of which also involved an integer overflow), P0_1803, 941746 and 956597. My own findings 997190, 995964 and 1004730 are also issues of this type.

Category 2: Callback storing raw pointer

Callbacks are used extensively in IPC related code. A callback in Chrome is usually created by the base::BindOnce or base::BindRepeating functions. When callbacks are created, arguments can be bound to the callback. Different types of bindings specify how the bind states are managed. The more dangerous one is the base::Unretained binding:

base::Bind(&MyClass::Foo, base::Unretained(ptr));

This indicates that the callback does not own ptr and that it is the caller's responsibility to ensure that ptr is alive when the callback is executed. While dangerous, this is a somewhat well-known issue and developers are usually aware of the consequences. Many uses of Unretained have comments that justify its use. Examples of this issue are P0_1743 and P0_1755; my finding 998548 falls into this category as well.
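The difference between an unretained binding and a lifetime-aware one can be sketched with the standard library alone. This is only an analogy and not Chromium's base library: std::function and std::weak_ptr stand in for base callbacks and weak pointers, and Service, BindUnretained, BindWeak and the two Run helpers are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <memory>

struct Service {
  int calls = 0;
  void Foo() { ++calls; }
};

// Analogous to base::Unretained: the callback stores a raw pointer, and the
// caller alone is responsible for keeping the Service alive until it runs.
std::function<void()> BindUnretained(Service* service) {
  assert(service != nullptr);
  return [service]() { service->Foo(); };
}

// Lifetime-aware alternative: the callback checks that the target still exists
// and silently drops the call if it does not. Returns true if Foo() ran.
std::function<bool()> BindWeak(const std::shared_ptr<Service>& service) {
  std::weak_ptr<Service> weak = service;
  return [weak]() {
    if (auto strong = weak.lock()) {
      strong->Foo();
      return true;
    }
    return false;
  };
}

// Runs a weak callback, optionally destroying the Service first.
bool RunWeakCallback(bool destroy_first) {
  auto service = std::make_shared<Service>();
  auto callback = BindWeak(service);
  if (destroy_first) service.reset();
  return callback();
}

// Runs an unretained callback while the Service is still alive, which is the
// only situation in which doing so is safe. Returns the number of calls made.
int RunUnretainedWhileAlive() {
  Service service;
  auto callback = BindUnretained(&service);
  callback();
  return service.calls;
}
```

The sketch deliberately has no helper that runs the unretained callback after destruction, since that would be exactly the use-after-free described above. When reviewing a use of Unretained, the question to ask is whether a renderer-controlled action can destroy the target between binding and execution.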

Category 3: RenderFrameHost lifetime issues

RenderFrameHost is the representation of a frame in the browser process. It is a long-lived object that outlives many IPC services. However, in some cases, services, or methods that bind a service, can outlive the RenderFrameHost. For example, in P0_1730, a callback created with BindRepeating is posted to the IO thread with an unretained raw pointer to a RenderFrameHost. As RenderFrameHost is created and destroyed on the UI thread, this leads to a race condition in which the RenderFrameHost is destroyed on the UI thread while the callback on the IO thread may still access it. Another example, in which a service that outlives RenderFrameHost retains a raw pointer to it, is 977462. I've not discovered any vulnerabilities in this category.
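One way to picture the problem, and a common way around it, is to let the posted task carry an identifier and look the object up again when the task runs, rather than carrying a raw pointer. Chromium offers RenderFrameHost::FromID for lookups of this kind, but the sketch below does not use it: it is a single-threaded simulation under my own assumptions, and Frame, FrameRegistry and RunPostedTasks are hypothetical names.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <memory>
#include <utility>

// Hypothetical stand-in for a frame object owned by one thread.
struct Frame {
  int id = 0;
  int pings = 0;
};

// Owns all frames and resolves IDs to live objects.
class FrameRegistry {
 public:
  Frame* Create(int id) {
    assert(frames_.count(id) == 0);
    auto frame = std::make_unique<Frame>();
    frame->id = id;
    Frame* raw = frame.get();
    frames_[id] = std::move(frame);
    return raw;
  }
  void Destroy(int id) { frames_.erase(id); }
  // Returns nullptr once the frame has been destroyed.
  Frame* FromID(int id) const {
    auto it = frames_.find(id);
    return it == frames_.end() ? nullptr : it->second.get();
  }

 private:
  std::map<int, std::unique_ptr<Frame>> frames_;
};

// Posts a task that refers to the frame by ID, optionally destroys the frame,
// then drains the queue. Returns how many tasks found their frame alive.
int RunPostedTasks(bool destroy_before_run) {
  FrameRegistry registry;
  // Stands in for a task queue that is drained some time after posting.
  std::deque<std::function<bool()>> queue;

  registry.Create(7);
  queue.push_back([&registry] {
    Frame* frame = registry.FromID(7);  // re-resolve at run time
    if (!frame) return false;           // frame is gone: drop the work
    ++frame->pings;
    return true;
  });

  if (destroy_before_run) registry.Destroy(7);

  int ran = 0;
  while (!queue.empty()) {
    if (queue.front()()) ++ran;
    queue.pop_front();
  }
  return ran;
}
```

Had the task captured the Frame* returned by Create instead of the ID, the destroy_before_run case would dereference freed memory. In real multi-threaded code the lookup also has to happen on the thread that owns the object, otherwise the race simply moves into the lookup itself.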

Conclusions

In this post I've reviewed several Chromium IPC vulnerabilities that were reported in the first half of 2019. The Chromium specific QL libraries and queries used in this research are published here. Hopefully this will help you in your own bug hunting efforts as well!