On September 30, 2021, Google released version 94.0.4606.71 of Chrome. The release note specified that two of the fixed vulnerabilities, CVE-2021-37975 and CVE-2021-37976, are being exploited in the wild. In this post, I’ll analyse CVE-2021-37975 (reported by an anonymous researcher), which is a logic bug in the implementation of the garbage collector (GC) in v8 (the JavaScript interpreter of Chrome). This bug allows reachable JavaScript objects to be collected by the garbage collector, which then leads to a use-after-free (UAF) vulnerability for arbitrary objects in v8. In this post, I’ll go through the root cause analysis and exploit development for the bug.

As usual, the disclaimer here is that, as I don’t have access to any information about the bug that isn’t public and I didn’t report the bug, the analysis is purely technical and offers my personal perspective. It is likely that my analysis will be very different from how the bug is actually exploited in the wild. As the bug has been patched for some time, I believe it’s now safe to publish an analysis of it. I’m hoping that the technical analysis will help provide some insight into the potential impact of garbage collector bugs in v8 and help prevent variants of similar bugs in the future.

The garbage collector is one of the v8 components that I dread most. It’s a fairly complex component that requires a relatively large time investment to study. The garbage collector also runs every time a JavaScript file is executed, so it has probably been run more than any other component in v8. On top of that, it introduces a great deal of uncertainty in the heap layout. Every time the garbage collector runs, it moves objects around and changes the heap layout completely, and it’s difficult to predict when it is going to run. There’s also concurrent garbage collection, which adds even more uncertainty to the mix.

I assumed that the chance of having a bug in the garbage collector that has not been caught by a fuzzer would be quite small and only looked at it when I had to. But when CVE-2021-37975 came out, I decided to take a deeper look into the internal workings of the garbage collector. I’m very glad I did this, because this bug has proven me wrong on so many levels and I’ve learned a lot from analysing it.

The garbage collector in v8

A very good and concise reference for the garbage collection algorithm can be found in the article Concurrent marking in v8 by Ulan Degenbaev, Michael Lippautz and Hannes Payer. The “Background” section covers what we need for this post. It’s a very short and well-illustrated section, so instead of repeating what it says, I refer you to that article and will just give a simple summary and introduce some debugging tools here. Another good and more in-depth reference is Trash talk: the Orinoco garbage collector by Peter Marshall.

The garbage collector in v8 is a mark-and-sweep garbage collector. When the garbage collector runs, it’ll first “visit” global objects and variables that are within the current scope, that is, objects that are known to be alive when the garbage collector runs. Objects that are believed to be alive but yet to be visited are stored in a worklist. In practice, “visit” means to call one of the Visit* methods. These methods will perform appropriate actions depending on the type of the object. For example, when visiting a JavaScript object, the visit method will mark the object as alive, and then go through other objects that are directly reachable from this object (properties, elements, etc.) and push them into the worklist, ready to be visited in the next iteration. Once an object in the worklist is processed, it will be marked as black, while objects that got pushed onto the worklist are marked as grey. Objects that are not yet reachable are unmarked and are coded as white objects. The iterations continue until no more new reachable objects are found. After that, all unmarked (white) objects are assumed to be unreachable and will be collected.

The v8 garbage collector is also a generational garbage collector. This means that the v8 heap is divided into the old space and the new space. These two spaces are used for storing small objects. There is also another space, the large object space, that is designated for large object allocations (objects of at least 1 << 17 bytes). When a small object is allocated, it is allocated in the new space. If the object survived a garbage collection, then it gets moved to the old space. This division allows the garbage collector to perform two different types of garbage collection. The minor gc(also known as scavenge) only looks for objects to collect in the new space, while the major gc(also known as mark-and-sweep or mark-compact) collects objects in all spaces. This allows the garbage collector to first look for the opportunity to free up memory in the new space before moving onto the other spaces in order to improve the performance of garbage collection.

Apart from a full garbage collection (implemented in mark-compact.cc), there is also concurrent marking, (implemented in concurrent-marking.cc). The main differences is that concurrent marking will only perform a few iterations of markings whereas mark-compact will carry out the markings until no new reachable objects are found and then perform garbage collection. As you’ll see, concurrent marking will introduce some uncertainties that need to be dealt with when triggering the use-after-free bug.

Some useful v8 flags that provide additional information for garbage collection are: trace-gc, trace-concurrent-marking, trace-unmapper and trace-gc-verbose. These provide useful logging information about when garbage collection and concurrent marking are triggered. The trace-unmapper flag will trace when some part of the heap gets freed after a garbage collection. I’ll go through this in more detail later.

For example, the trace-gc flag will show when garbage collector is triggered, and which type of garbage collection it is:

[737676:0x55ca9102b6e0]       18 ms: Scavenge 13.1 (22.7) -> 13.1 (30.7) MB, 0.8 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure 
[737676:0x55ca9102b6e0]       33 ms: Mark-sweep 13.8 (31.6) -> 13.8 (31.6) MB, 13.8 / 0.0 ms  (+ 0.2 ms in 3 steps since start of marking, biggest step 0.2 ms, walltime since start of marking 14 ms) (average mu = 0.383, current mu = 0.383) finalize incremental marking via stack guard GC in old space requested
[737676:0x55ca9102b6e0]       37 ms: Scavenge 21.8 (40.1) -> 21.8 (40.1) MB, 0.9 / 0.0 ms  (average mu = 0.383, current mu = 0.383) allocation failure 

In the above, a mark-sweep, which indicates a major gc that will mark and collect objects in the whole heap, is triggered, while a scavenge indicates a minor gc.

A garbage collection may also be accompanied by some UnmapFreeMemoryJob, which will either remove access permissions to the pages that are freed or return them to the system. These tasks can be traced using the trace-unmapper flag. As you’ll see, these unmap jobs will affect the reliability of the exploit.

[785510:0x56034648b270] Unmapper::FreeQueuedChunks: new Job
[785510:0x56034648b270] Unmapper::PerformFreeMemoryOnQueuedChunks: 1 queued chunks
[785510:0x56034648b270] UnmapFreeMemoryTask Done
[785510:0x56034648b270] Unmapper::PerformFreeMemoryOnQueuedChunks: 1 queued chunks
[785510:0x56034648b270] UnmapFreeMemoryTask Done

WeakMap and ephemerons

So far, so good, until we encounter weak references in JavaScript. In JavaScript, weak references can be defined using WeakMap, WeakSet or WeakRef. For this post, only WeakMap is relevant so I’ll explain it here and show how it affects garbage collection. A WeakMap offers a way to store an object into a map without preventing it from being garbage-collected. For example, in the following code:

var wm = new WeakMap();
{
  let key = {k : 1};
  let value = {v : 1};
  wm.set(key, value);
}
//key goes out of scope 

… I have a key and value, both stored inside the WeakMap wm. Once I’m outside the block that defines key and value, they are out of scope and the only references left to them are held in wm. In this case, when garbage collection happens, both key and value will be collected, even though wm is still alive and holds a reference to both. In this case, we say WeakMap is holding weak references to key and value because the references it holds to them does not prevent them from being collected. Note that this will not lead to garbage collected objects being reachable in JavaScript because once key and value are collected, there is no way to recover them from WeakMap. As long as the key to a value in the WeakMap is alive, the value is considered reachable and will not be collected. For example, in the following case

var wm = new WeakMap();
let key = {k : 1};
{
  let value = {v : 1};
  wm.set(key, value);
}
//value goes out of scope

… because key is still alive outside of the code block, value will not be garbage collected. In a nutshell, a value in a WeakMap can be collected if and only if both its key and itself have no other references outside of the WeakMap. This kind of relationship is called ephemeron and the key-value pair is called an ephemeron pair.

So WeakMap and ephemeron certainly complicate things for the garbage collector. If the garbage collector comes across a WeakMap, it cannot decide whether its keys and values can be collected until it has processed all other objects and figured out whether any of them are referencing these keys and values. So how does the garbage collector do it? The WeakMap stores its keys and values in the field table, which is a hash table of EPHEMERON_HASH_TABLE_TYPE:

d8> x = new WeakMap()
d8> %DebugPrint(x)
DebugPrint: 0xeac0804944d: [JSWeakMap]
 - map: 0x0eac08202b41 <Map(HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x0eac081c7741 <Object map = 0xeac08202b69>
 - elements: 0x0eac0800222d <FixedArray[0]> [HOLEY_ELEMENTS]
 - table: 0x0eac0804945d <Other heap object (EPHEMERON_HASH_TABLE_TYPE)>
....

When the garbage collector visits an EPHEMERON_HASH_TABLE_TYPE object, the VisitEphemeronHashTable method is used.

int MarkingVisitorBase<ConcreteVisitor, MarkingState>::VisitEphemeronHashTable(
    Map map, EphemeronHashTable table) {
  ...
  for (InternalIndex i : table.IterateEntries()) {
    ...
    if (concrete_visitor()->marking_state()->IsBlackOrGrey(key)) {
      VisitPointer(table, value_slot);           //<-------------------- 1.
    } else {
      Object value_obj = table.ValueAt(i);      //<--------------------- 2.

      if (value_obj.IsHeapObject()) {
        HeapObject value = HeapObject::cast(value_obj);
        ...

        // Revisit ephemerons with both key and value unreachable at end
        // of concurrent marking cycle.
        if (concrete_visitor()->marking_state()->IsWhite(value)) {
          weak_objects_->discovered_ephemerons.Push(task_id_,
                                                    Ephemeron{key, value});
        }
      }
    }
  }
  return table.SizeFromMap(map);
}

The important parts are in the branches labelled 1. and 2. in the above snippet. If, by the time the WeakMap is visited, a key is already found to be reachable (marked as black or grey), then it’ll treat the corresponding value as reachable also and proceed with VisitPointer. On the other hand, if the key is not yet marked, then it cannot decide whether the value is unreachable or not, because the key may yet be marked later when more objects are visited. In that case, both the key and the value are stored in discovered_ephemerons to be processed when all other objects are marked.

At the end of a garbage collection cycle, all the ephemeron pairs discovered are processed in the ProcessEphemeronMarking method. This will first try to process the ephemerons with an iterative algorithm similar to how live objects are being marked. Essentially, it repeatedly calls the ProcessEphemeron method on all the discovered ephemerons:

bool MarkCompactCollector::ProcessEphemeron(HeapObject key, HeapObject value) {
  if (marking_state()->IsBlackOrGrey(key)) {
    if (marking_state()->WhiteToGrey(value)) {
      local_marking_worklists()->Push(value);
      return true;
    }
  } else if (marking_state()->IsWhite(value)) {
    weak_objects_.next_ephemerons.Push(kMainThreadTask, Ephemeron{key, value});
  }

  return false;
}

If any of the keys are marked by this stage, then their values get marked as grey and pushed into the worklist, and the garbage collector will revisit them as normal strongly referenced objects once all the discovered ephemerons are processed. If, on the other hand, the key is still unmarked at this stage, the ephemeron pair will be queued in the next_ephemerons worklist to be processed on the next iteration. The algorithm roughly goes as follows:

Process discovered_ephemerons
  -update local_marking_worklists
  -update next_ephemerons
  -current_ephemerons -> next_ephemerons

Do until no new object is added to local_marking_worklists
  Process current ephemerons
    -update local_marking_worklists
    -update next_ephemerons
  Process local_marking_worklists
    -update discovered_ephemerons
  Process discovered_ephemerons
    -update local_marking_worklists
    -update next_ephemerons
  current_ephemerons -> next_ephemerons

The implementation of the algorithm inside the loop can be found in ProcessEphemerons. This carries on until no new object is marked or until the maximum number of iterations is reached. If it exits because maximum iteration is reached, it then uses a different algorithm for marking.

The vulnerability

Now that I’ve covered some concepts related to the garbage collector and ephemerons, we can take a look at the patch. The patch contains changes to a number of files, but the most crucial change is the following:

diff --git a/src/heap/mark-compact.cc b/src/heap/mark-compact.cc
index 85e9618..dae343c 100644
--- a/src/heap/mark-compact.cc
+++ b/src/heap/mark-compact.cc
...
 bool MarkCompactCollector::ProcessEphemerons() {
   Ephemeron ephemeron;
 
   // Drain marking worklist and push discovered ephemerons into
   // discovered_ephemerons.
-  DrainMarkingWorklist();
+  size_t objects_processed;
+  std::tie(std::ignore, objects_processed) = ProcessMarkingWorklist(0);
+
+  // As soon as a single object was processed and potentially marked another
+  // object we need another iteration. Otherwise we might miss to apply
+  // ephemeron semantics on it.
+  if (objects_processed > 0) another_ephemeron_iteration = true;

As well as similar changes in concurrent-marking.cc. After the change, the algorithm does not just stop when local_marking_worklists becomes stable, it’ll also check if any object in local_marking_worklists gets processed. Along with the change, the comment suggests that, if an object in local_marking_worklists gets processed, then it can potentially mark some keys to previously unreachable ephemerons. If the algorithm then terminates, the status of these ephemerons may not be updated and their values may be collected. For example, if at the start of an iteration, all the key-value pairs in current_ephemerons are marked white, but after local_marking_worklists is processed one of these keys is marked (let’s call it k1, with value v1) but no new ephemeron is discovered, then discovered_ephemerons will be empty and the iteration will finish with all the ephemeron pairs in current_ephemerons still unmarked. This will cause v1 to be collected. (However, k1 will not be collected because it is marked.)

To understand how to arrive at such a situation, I’m going to work backwards and try to reconstruct the state of the discovered_ephemerons at the start of the processing. Note that, at the start of the ephemeron processing algorithm, all the non-ephemeron objects are processed and the local_marking_worklists is empty:


        PerformWrapperTracing();
        DrainMarkingWorklist();           //<------- will empty local_marking_worklists
      } while (!heap_->local_embedder_heap_tracer()->IsRemoteTracingDone() ||
               !local_marking_worklists()->IsEmbedderEmpty());
      ...
    }
    ...
    {
      ...
      ProcessEphemeronMarking();
      ...

… and current_ephemerons is only populated after the initial discovered_ephemerons are processed. So in the beginning of the processing, the only non-trivial list is the discovered_ephemerons.

Now let’s assume we’re at the final iteration of the algorithm and we’d like the local_marking_worklists to process new objects. This means that, at the start of this iteration:

  1. The list current_ephemerons cannot add any object to local_marking_worklists, otherwise another iteration will be needed. This means that all keys in current_ephemerons are white at this point.
  2. As current_ephemerons does not add to local_marking_worklists, and local_marking_worklists is drained in each iteration before discovered_ephemerons is processed:
     bool MarkCompactCollector::ProcessEphemerons() {
       ...
       // Drain marking worklist and push discovered ephemerons into
       // discovered_ephemerons.
       DrainMarkingWorklist();             //<------- empty local_marking_worklists
       ...
       while (weak_objects_.discovered_ephemerons.Pop(kMainThreadTask, &ephemeron)) {
         if (ProcessEphemeron(ephemeron.key, ephemeron.value)) {
           ephemeron_marked = true;
         }
       }
       ...
       return ephemeron_marked;
     }
    

    … this means that all objects in the current local_marking_worklists must be added in the previous iteration when the previous discovered_ephemerons is processed.

  3. The list discovered_ephemerons will be empty at the beginning of the iteration but may have new objects after local_marking_worklists is drained. All the newly discovered ephemerons must have white keys or a new iteration will be started.

For example, the following figure shows a possible state where processing local_marking_worklists will mark a key (k1) in current_ephemerons while exiting the iteration. This will then cause v1 to be collected while still reachable via k1.

figure

The figure shows the state of each list at the beginning of being processed (instead at the beginning of the iteration, because the lists get filled/emptied within the iteration). A solid arrow indicates a strong reference and the color of the boxes indicates the color of the object when the list is processed. So for example, the figure shows a current_ephemerons list with two pairs of ephemerons, both with unmarked keys and values, and a local_marking_worklists with an object v3 that comes from an ephemeron pair (k3, v3) from discovered_ephemerons in the previous iteration, where v3 was unmarked and contained a strong reference to the key k1. After local_marking_worklists is processed, both v3 and k1 will be marked and the object v1 in current_ephemerons will be reachable via v3->k1->v1, but v1 will still be unmarked because there is no further processing of ephemerons.

This gives a possible state of each list at the final iteration for the vulnerability to trigger, so let’s move on to the previous iteration. As noted before, for local_marking_worklists to contain objects, discovered_ephemerons must be non empty in this iteration and contain at least a pair (k3, v3) where v3 is unmarked. However, the key, k3 of this pair, must be marked in order for this pair to be pushed into the local_marking_worklists when ProcessEphemeron is called.

This means that, at this iteration, local_marking_worklists must:

  1. Contains a WeakMap, say, wm, so that its keys and values are added to the discovered_ephemerons.
  2. One of the key pairs, say, (k3, v3) must have a marked key but unmarked value.

Let’s assume condition one is satisfied and take a look at what happens when wm is processed in local_marking_worklists. Recall that, when a WeakMap is processed, VisitEphemeronHashTable is called:

int MarkingVisitorBase<ConcreteVisitor, MarkingState>::VisitEphemeronHashTable(
    Map map, EphemeronHashTable table) {
  ...
  for (InternalIndex i : table.IterateEntries()) {
    ...
    if (concrete_visitor()->marking_state()->IsBlackOrGrey(key)) {
      VisitPointer(table, value_slot);                     //<----------- mark.
    } else {
      Object value_obj = table.ValueAt(i);                 //<------------discover.
        ...
        if (concrete_visitor()->marking_state()->IsWhite(value)) {
          weak_objects_->discovered_ephemerons.Push(task_id_,
                                                    Ephemeron{key, value});
        }
      }
    }
  }
  return table.SizeFromMap(map);
}

So for each key-value pair in wm, if the key is already marked, then it’ll proceed to VisitPointer instead of adding the pair to discovered_ephemerons, which means that if a pair is to be added to discovered_ephemerons, then its key must not be marked, which seems contradictory to condition two above. This apparent contradiction is probably why this bug remains so well hidden.

But join me in thinking about it a bit more. The code in VisitEphemeronHashTable means that the pair (key, value) is added to discovered_ephemerons only if key is unmarked at the time when key is being processed, whereas condition two is the condition after the entire local_marking_worklists is processed. So if, for example, there are two key-value pairs in wm, (k3,v3) and (k4,k3) so that, k3, v3 are unmarked, but k4 is marked, and (k3,v3) is visited before (k4,k3) in the EphemeronHashTable, then when (k3,v3) is visited, it’ll be added to discovered_ephemerons because both k3 and v3 are unmarked. Then, when (k4,k3) is visited, k3 will be marked, so after all entries are visited, (k3,v3) will be in discovered_ephemerons with k3 marked and v3 unmarked!

figure

Now for wm to be in local_marking_worklists, current_ephemerons must contain an ephemeron pair (kwm, wm) where kwm is marked so that the pair gets pushed to local_marking_worklists when current_ephemerons is processed. So the state for the last two iterations should look like:

figure

In JavaScript, this is achieved as follows:

var rootWm = new WeakMap();
var kwm = {};
var k4 = {};
{
  let k1 = {};
  let k2 = {};
  let k3 = {};
  let v3 = k1;
  wm.set(k3, v3);
  wm.set(k4, k3);
  rootWm.set(k1, v1); 
  rootWm.set(k2, v2);
  let wm = new WeakMap();
  rootWm.set(kwm, wm);
}
//v1 can be retrieved as follows:
let wm = rootWm.get(kwm);
let k1 = wm.get(wm.get(k4));
let v1 = rootWm.get(k1);

This, however, will not produce a use-after-free, for a number of reasons:

  1. There are actually two rounds of ProcessEphemeronMarking

     ...
       ProcessEphemeronMarking();         //<---------- Round one
       DCHECK(local_marking_worklists()->IsEmpty());
     }
     ...
     // Process finalizers, effectively keeping them alive until the next
     // garbage collection.
     {
       ...
       DrainMarkingWorklist();
     }
        
     // Repeat ephemeron processing from the newly marked objects.
     {
       ProcessEphemeronMarking();        //<----------- Round two
     ...
    
  2. Prior to garbage collection, concurrent marking is performed. This seems inevitable and will perform some iterations of markings. So by the time the garbage collection starts, current_ephemerons will be moved to some different states, which would also cause the unmarked ephemerons to be discovered.

Problem one is fairly trivial to resolve. In the current setup, by the end of the first ProcessEphemeronMarking, we’ll have two ephemeron pairs, (k1,v1) and (k2,v2), in which k1 is marked while the other objects are unmarked. These will be the pairs in current_ephemerons at the start of the second ProcessEphemeronMarking phase. By comparing this to the state at the start of the first ProcessEphemeronMarking, which has the pairs (kwm,wm), (k1,v1) and (k2,v2), with all objects except kwm unmarked, you can see that all I need to do is to repeat the construction, with (k1,v1) taking the place of (kwm,wm) and (k2,v2) taking the place of (k1,v1). This will cause the object v2 to be reachable while unmarked by the end of the second ProcessEphemeronMarking phase.

To solve the second problem, I’ll need to create an object graph that will return to a similar state after each iteration. This can be achieved, for example, by nesting WeakMap objects. Consider the following objects:

figure

I have a WeakMap wm1 that contains several other WeakMaps inside it. Suppose at the beginning of an iteration, current_ephemerons consists of the entries discovered in wm1 (for example, if wm1 is a variable that is alive, then its entries will be discovered after it gets visited), then after current_ephemerons are processed, wm2 will be added to local_marking_worklists because k2 is marked, while the other entries will be pushed to the next current_ephemerons list, waiting to be proceessed in the next iteration. After local_marking_worklists is processed, k3 will be marked because k5 is marked. So in the beginning of the next iteration, current_ephemerons will contain the pairs (k3, wm3) and (k4, wm4) with k3 marked. This is a state similar to the start of the previous iteration, except there is now one less unmarked ephemeron pair. The following figure illustrates what happens in two iterations:

figure

As garbage collection will continue to iterate until no new object is marked, by using a nested WeakMap that is deep enough, I can ensure that current_ephemerons will be in the correct state after some iterations triggered by garbage collection, while concurrent marking would not iterate far enough to disturb our objects.

Once these issues are resolved, the UAF bug can be triggered. However, because the entries in an EphemeronHashTable are determined by the hash of the key, which is not deterministic, the key-value pair may not always be processed in the desired order. This reduces the success rate to trigger the bug, but since a failure will not result in a crash or memory corruption, by retrying enough times, the bug will eventually trigger.

Exploiting the bug

Now that I’ve shown how this bug may be triggered, let’s take a look at how to exploit it. As you saw in the previous section, this bug can be used to cause any object to be collected while it can still be referenced in JavaScript. This would result in a UAF for any JavaScript object of my choice, which is a very strong primitive. This, however, does not mean the bug is trivial to exploit. To free the object, I need to trigger garbage collection, which will not only collect a whole lot of other objects but also move surviving objects to the old space. This makes it very difficult to reliably replace the freed object to do something useful. So instead of trying to reclaim the freed object directly, I’m going to try to do a partial reclaim. The idea is that, while the garbage collector moves surviving objects to the old space, freed objects are not moved. If the object data does not get overwritten, then the freed reference will still be pointing to the same object, but in a corrupted state. If the freed object contains references to other objects that live in a different and more stable heap, then I can turn this into a UAF of such object and avoid working with the v8 heap.

Thinking along this line, there are two candidate objects that stand out:

  1. TypedArray/ArrayBuffer: These objects contain a pointer to their backing store, which is allocated with the PartitionAlloc. For UAF, this heap is much easier to exploit than the v8 heap. By using a TypedArray/ArrayBuffer as the object for this UAF, I can turn it into a UAF of the backing store in PartitionAlloc and follow the now-standard exploitation technique used in Operation Wizard Opium (See also The Journey from exploiting PartitionAlloc to escaping the sandbox: Chromium Fullchain - 0CTF 2020) to exploit this issue.
  2. Large JSArray: As pointed out in Exploiting CVE-2021-21225 and disabling W^X by Brendo Tiszka (a very good post that contains many techniques and which I’ll refer to multipe times in this post), large objects in the v8 heap are allocated in the large object space. As this space is not used often and the objects do not get moved during garbage collection, it’s more stable than the new space in the v8 heap. When allocating large JSArray, while the array itself is still allocated in the new space, the backing store of the array gets allocated in the large object space. By causing a UAF with large JSArray, I can turn this into a UAF of the backing store in the large object space. This could allow me to replace a large double array backing store with an object array backing store and cause type confusion.

The first option is probably the standard way to exploit this bug. The method is fairly well-documented and was used multiple times. As the second option does not seem to have been done before, I decided to follow that path and try exploit it with large JSArray.

Getting a type confusion

Initially, to test whether my assumptions were correct, I used TypedArray to trigger the UAF and check whether the backing store content had changed. When testing, I used the --expose-gc flag in v8 that allows me to trigger garbage collection with the gc function exposed by this flag. I got a reasonable success rate and was able to retrieve the freed TypedArray with different backing store content. This confirms that the partial replacement idea is viable and that objects do not get wiped after they are freed, while their backing stores can still be replaced. However, after removing the --expose-gc flag and triggering garbage collection by allocating a large ArrayBuffer (See “Trick #2: Triggering Major GC without spraying the heap” in Exploiting CVE-2021-21225 and disabling W^X), I kept getting SEGV_ACCER when I tried to access the freed TypedArray. It turns out that, when triggering garbage collection by large allocation, some UnmapFreeMemoryJob will also be spawned to free up pages that are no longer in use.

void MemoryAllocator::PerformFreeMemory(MemoryChunk* chunk) {
  ...
  VirtualMemory* reservation = chunk->reserved_memory();
  if (chunk->IsFlagSet(MemoryChunk::POOLED)) {
    UncommitMemory(reservation);
  } else {
    DCHECK(reservation->IsReserved());
    reservation->Free();
  }
}

Depending on the flag on a memory chunk, it’ll either be uncommitted (UncommitMemory), which will mark it as inaccessible, or freed (Free), which will both make it inaccessible and wipe all the data in the chunk. This means that the memory chunk where my freed object is must have been freed by these unmap jobs.

This problem somehow changed when I started spraying the heap with large JSArray after garbage collection. I no longer got SEGV_ACCERR. Instead, I got null dereferences and various assertion failures due to the JSArray object being invalid. Upon closer inspection, it seems that after spraying the heap, the page where the freed object is becomes accessible again, but all the data is wiped. My guess is that, by spraying the heap, pages that were previously freed get reclaimed and become accessible again–but because they’re freed, all the data is wiped.

After trying different ways of triggering garbage collection to see if I could avoid the UnmapFreeMemoryJob, I discovered that if the garbage collection is triggered by too large an allocation, then many pages will be freed instead of uncommitted. So for example, the following:

new ArrayBuffer(0x7fe00000);

will cause pages to be free’d, whereas a smaller allocation will only cause pages to be uncommitted:

new ArrayBuffer(0x4fe00000);

The above is still large enough to trigger garbage collection, but will be less likely to free pages. So by using a smaller allocation to trigger gc, and then spraying the heap with large JSArray, I was able to trigger the UAF with a large double array, and then replace its backing store with that of a large object array. This causes a type confusion between a double array and an object array and allows me to have two different “views” of an object. From this, I can obtain the address of an arbitrary object easily.

figure

This figure shows that, if my freed array is lDblArr, and I managed to replace its backing store with that of lObjArr, then obj1 and obj2 in lObjArr would be read as double in lDblArr. By converting the double back into bytes, I can read out the addresses of obj1 and obj2. Note that addresses of JavaScript objects are stored as 32-bit values due to pointer compression. Pointer compression is an optimization that only stores the lower 32 bits of the full 64-bit pointers in the heap, while the higher 32 bits remain constant and are cached in a register. These lower 32-bit addresses are referred to as compressed pointers.

This type confusion enables me to read the address of any JavaScript object and gives me the addressof primitive readily.

By taking the address of a small array, I can also use these two “views” to craft any fake object. A JSArray, which is the C++ representation of a JavaScript array, contains a field, elements, that points to its backing store. For small array, the backing store is inlined, which means it is allocated adjacent to the object, either before or after it.

d8> x = [1.1,1.1,1.1]
[1.1, 1.1, 1.1]
d8> %DebugPrint(x)
DebugPrint: 0x360308049471: [JSArray]
 - map: 0x360308203ae1 <Map(PACKED_DOUBLE_ELEMENTS)> [FastProperties]
 - prototype: 0x3603081cc0f9 <JSArray[0]>
 - elements: 0x360308049451 <FixedDoubleArray[3]> [PACKED_DOUBLE_ELEMENTS]

For example, in this case, the backing store (elements) is located in front of the JSArray at an offset of -0x20 (0x360308049451 - 0x360308049471). The offset of the backing store for an inlined array of particular type and length is predictable and can be found easily using %DebugPrint.

To fake an object, I can create an inlined double array, fakeObj, and read its address by storing it in lObjArr. This would allow me to compute the address of its backing store using the offset. Now if I store the address of the backing store in lDblArr, then lObjArr will be treating the double elements of fakeObj as the memory content of an object:

let fakeObj = [1.1,1.1,1.1];
lObjArr[0] = fakeObj;
lObjArr[1] = fakeObj;

let addressFloat = lDblArr[0];  //<---- reads address of fakeObj as double
let address = ftoi32(addressFloat)[0]; //<---- converts to 32 bit integer

let elements = address -0x20; //<-------- backing store address
lDblAddr[0] = i32tof(elements, elements) //<------- converts element addresses back to float and store in lDblArr
var obj = lObjArr[0];  //<--------- elements being read as Javascript object:
                       //           faked object with content treated as: 
                       //           map: 0x9999999a prototype: 0x3ff19999 ...(hex representation of 1.1)

By writing into the fakeObj array, I can craft an object with content that I control. As v8 determines the type of an object by reading its first field, map, I can create a fake double array if I know the value of the double array map:

fakeObj[0] = i32tof(doubleArrayMapAddr, doubleArrayMapAddr);
fakeObj[1] = i32tof(fakeElementStoreAddr, fakeLength);

%DebugPrint(lObjArr[0]);

DebugPrint: 0x3bed08045b09: [JSArray]
 - map: 0x3bed08203ae1 <Map(PACKED_DOUBLE_ELEMENTS)> [FastProperties]
 - prototype: 0x3bed081cc0f9 <JSArray[0]>
...

At this stage, the type confusion between my large double array and large object array gives me both the addressof and fakeobj primitive. However, it does not offer me an obvious way to leak the address of a map, which means I still would not be able to create any useful fake object. While the object and double backing store readily allows me to perform out-of-bounds access in the large object space, the space is mostly just used for allocating backing stores of large arrays, which means it mostly just contains addresses of JavaScript objects, so that doesn’t give me any more information either.

Bypassing the need to get an infoleak

Just as I was about to give up and try the TypedArray/ArrayBuffer option, I noticed something interesting. The compressed address of the JavaScript double array map does not seem to change at all, even after launching v8 many times:

d8> x = [1.1,1.1,1.1]
[1.1, 1.1, 1.1]
d8> %DebugPrint(x)
DebugPrint: 0x2c0a08049471: [JSArray]
 - map: 0x2c0a08203ae1 <Map(PACKED_DOUBLE_ELEMENTS)> [FastProperties]
... //restarting d8
d8> x = [2.3,3.3,4.5,6.7]
[2.3, 3.3, 4.5, 6.7]
d8> %DebugPrint(x)
DebugPrint: 0xc3908049479: [JSArray]
 - map: 0x0c3908203ae1 <Map(PACKED_DOUBLE_ELEMENTS)> [FastProperties]

In the above, the lowest eight digits is the compressed address. Notice that in both cases, the compressed address of map is the same: 0x8203ae1. This is true across many different launches of d8. The compressed address is the same, and only the top 32 bits of the address is different. Thinking that it might be something to do with the standalone version of v8 (d8), I tried it on Chrome itself. While the compressed address of the map is different from that on d8, it is still consistent across different launches of Chrome. Thinking that it might be something to do with my computer, I tried it on a different computer, and the address of the map was the same across two different devices, relaunches, and reboots. So it looks like the map is allocated at a fixed offset that depends only on the version of v8 and whether it is launched from d8 or on Chrome (on Chrome, the address is also different depending whether it is in a worker or a dom window, but otherwise stable). I’m not totally sure why this is the case, but my guess is that it may be due to an optimization called heap snapshot, which initializes the heap with a ready made “snapshot” that contains all the commonly used objects to speed up the launch, and that object maps are part of the snapshot. As the snapshot is typically part of the binary, it is conceivable that the offsets of objects in the snapshot will be constant.

Getting code execution

Now that I’ve got the address for the map of a double array, I can create a fake double array and set its elements to an arbitrary compressed address. Accessing elements of this fake double array will then allow me to read and write to the arbitrary compressed pointer address provided by its elements.

fakeObj[0] = i32tof(doubleArrayMapAddr, doubleArrayMapAddr);
fakeObj[1] = i32tof(fakeElementStoreAddr, fakeLength);

let fakeArray = lObjArr[0];
var x = fakeArray[0];  //<-------- reads from fakeElementStoreAddr and return as double
fakeArray[0] = y;      //<-------- writes to fakeElementStoreAddr

From this point on, the exploit is fairly standard and consists of the following steps:

  1. Create a WebAssembly::Instance object with a WebAssembly::Module and use the arbitrary compressed address read primitive to read the pointer to its wasm code. This will be a 64-bit pointer that points to the start of a RWX page that contains code to be executed when the main function of the WebAssembly::Instance object is called
  2. To write to this 64-bit RWX address, create a TypedArray object and use the arbitrary compressed address write primitive to overwrite its backing store pointer (which is also a 64-bit pointer) to point to the address of the wasm RWX page that was obtained in the previous step.
  3. Writing to this TypedArray will now overwrite the code that is going to be executed by the main function. This then leads to arbitrary code execution.

The beginning of the end of wasm RWX?

While the method in the previous section works for the standalone d8, when trying to get code execution on the Linux build of Chrome, I encountered a rather mysterious crash. When writing shell code to the wasm RWX pages, the exploit crashes with SIG_SEGV even though the wasm pages are marked as RWX. It turns out that, on Linux and ChromeOS builds of Chrome, a feature kWebAssemblyCodeProtectionPku is currently on field trial, which means that the wasm-memory-protection-keys flag is switched on to write-protect wasm memory. On hardware that supports memory protection keys, memory protection keys will be used to write-protect the memory at a hardware level, which causes a fault even if the memory is marked RWX. While there are other ways bypass this and to get RWX memory, such as overwriting the write_protect_code_memory_ member in the heap structure (See “Trick #6: Disabling W^X at Runtime” in Exploiting CVE-2021-21225 and disabling W^X), the easiest way here seems to be to just overwrite the FLAG_wasm_memory_protection_keys that is used to protect wasm memory. As the flag is located at a fixed offset from libchrome.so, I can easily compute its address if I can leak another fixed object in libchrome.so to find the library base. This is fairly easy. In the Chrome renderer, blink objects are accessed from JavaScript via a wrapper object, which contains a pointer to a WrapperTypeInfo that is at fixed offset from the library base. For example, the OfflineAudioContext blink object contains a pointer to OfflineAudioContext::wrapper_type_info:

//%DebugPrint(new OfflineAudioContext(1,4000,400))
Thread 1 "chrome" hit Breakpoint 1, v8::internal::DebugPrintImpl (maybe_object=...)
    at ../../v8/src/runtime/runtime-test.cc:850
850       StdoutStream os;
(gdb) p/x maybe_object
$1 = {<v8::internal::TaggedImpl<v8::internal::HeapObjectReferenceType::WEAK, unsigned long>> = {static kIsFull = 0x1, 
    static kCanBeWeak = 0x1, ptr_ = 0x131f080a4ee9}, <No data fields>}
(gdb) p/x 0x131f080a4ee8
$2 = 0x131f080a4ee8
(gdb) x/8x 0x131f080a4ee8
0x131f080a4ee8: 0x08249cd9      0x0800222d      0x0800222d      [0x60e9e648]
0x131f080a4ef8: [0x00005555]      0x003a85a8      0x00001620      0x08002699
(gdb) x/x 0x555560e9e648
0x555560e9e648 <_ZN5blink21V8OfflineAudioContext18wrapper_type_info_E>: 0x00000001

As you can see, the OfflineAudioContext contains the WrapperTypeInfo pointer at offset 0xc. By using the arbitrary read in compressed pointers, I can read this address and use this to find the address of FLAG_wasm_memory_protection_keys. Overwriting this key before compiling wasm code allows me to write to the wasm RWX page and use it to execute code.

The exploit can be found here with some set up notes.

Conclusion

In this post, I’ve analyzed CVE-2021-37975 and have shown how logic bugs in the garbage collector can result in very powerful vulnerabilities. The fact that weak references and ephemerons require special treatments in the garbage collector creates extra complexity and edge cases, which then result in this bug. I’ve also looked into some ways to exploit garbage collected objects and shown that, even with the uncertainties introduced by garbage collection, it is still possible to exploit these bugs reliably. I hope these findings will encourage other researchers to study and audit the garbage collector and help secure this often overlooked part of v8.