"Let's mangle it!"

Last summer, I found an infinite loop vulnerability in the packet-mangler component of Apple’s macOS operating system kernel and reported it to Apple on July 14, 2017. More recently, I found a stack buffer overflow vulnerability in the same block of code and reported it to Apple on February 14, 2018. Both vulnerabilities were fixed in macOS High Sierra 10.13.5, which was released on June 1, 2018. This blog post is about how I found the infinite loop vulnerability last summer using CodeQL. I have also included my proof-of-concept exploit, which triggered both bugs.

Severity and mitigation

There are two vulnerabilities in the same block of code. The first enables a remote attacker to trigger an infinite loop in the kernel. The effect of this is to knock the Mac offline and hog one of its CPU cores. (No permanent damage is done, but a reboot is required to restore internet connectivity.) The second is a stack buffer overflow, which potentially enables a remote attacker to run arbitrary code on the Mac with kernel privileges. To trigger either of these bugs, the attacker just needs to know the IP address of the Mac, so that they can send a malicious network packet to it.

The vulnerability is in the packet-mangler component of the Darwin XNU operating system kernel. The good news is that this component is usually disabled, so most Mac users are safe.

To check whether your Mac is vulnerable, run the following command in a terminal:

netstat | grep packet-mangler

If you get one line of output, then you are safe. If you get two lines then your Mac is vulnerable. The two lines look like this:

       4        9        1     8192     2048 com.apple.packet-mangler
kctl       0      0      1      4 com.apple.packet-mangler

The second line is the problem. It indicates that the packet-mangler is enabled. If you discover that the packet-mangler is enabled on your Mac, then please contact me (@kevin_backhouse), because I am very interested to know which software uses it.

The vulnerability

The bugs are in the function pktmnglr_ipfilter_input. The function is 200 lines long, so I don’t think it would be helpful to reproduce it all here. Instead, I recommend opening it in another tab so that you can see the lines of code that I am referring to. The bugs are in the block of code starting at line 944 and ending at line 1009.

It seems that the purpose of pktmnglr_ipfilter_input is to parse incoming network packets, and sometimes either modify them or drop them altogether. Clearly, any data that comes from an incoming network packet could be malicious and needs to be treated with extreme care. The parameter named data contains the contents of the incoming network packet. The local variables named ip, tcp, and tcp_opt_buf are all initialized with data from the network packet, on lines 866, 912, and 951, respectively. Since the values of those variables can be controlled by an attacker, they should be treated as potentially malicious.

The first problem is that the loop on line 958 keeps looping until tcp_optlen is zero. However, tcp_optlen is not a simple decrementing counter: it is updated with a compound assignment on lines 966 and 993. This means that there is no particular reason why tcp_optlen should ever become zero: it could either get stuck before it reaches zero or it could overshoot and become negative. My proof-of-concept exploit triggers an infinite loop by ensuring that tcp_opt_buf[i+1] == 0 on line 966. It does this by setting tcp_opt_buf[0] to 2 and tcp_opt_buf[1] to 0, so the compound assignment subtracts zero on every iteration. The compound assignment on line 993 could be exploited in a similar way.
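To make the stuck counter concrete, here is a minimal user-space sketch of a loop with the same shape. It is a simplified model and not the kernel code: the function name, the MAX_STEPS cap, and the bounds checks are assumptions added so that the model always returns instead of hanging or reading out of bounds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap so that this user-space model always returns. */
#define MAX_STEPS 1000

/*
 * Simplified model of an option-skipping loop (not the kernel code).
 * Returns the number of iterations taken, or -1 if the counter did not
 * reach zero within MAX_STEPS iterations or the index left the buffer.
 */
int walk_options_model(const uint8_t *opts, size_t buf_size, int optlen)
{
    size_t i = 0;
    int steps = 0;

    while (optlen) {
        if (steps++ >= MAX_STEPS || i >= buf_size)
            return -1;                  /* stuck, overshot, or out of bounds */
        if (opts[i] == 1) {             /* NOP: a single-byte option */
            optlen--;
            i++;
        } else {
            if (i + 1 >= buf_size)
                return -1;              /* no length byte left to read */
            /* The length byte comes straight from the packet. */
            optlen -= opts[i + 1];
            i += opts[i + 1];
        }
    }
    return steps;
}
```

With the option bytes {2, 0, 0, 0} and an initial length of 4, the length byte is zero, so neither the counter nor the index ever changes and the model gives up with -1. With four NOP bytes {1, 1, 1, 1} it finishes in 4 iterations.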

The second problem is a negative integer overflow on line 951. If the value of tcp.th_off is zero on line 946, then the value of tcp_optlen will be negative. This triggers a negative integer overflow, because the third argument of mbuf_copydata is a size_t, so the negative value is converted to a very large unsigned length. Since tcp_opt_buf is stack allocated, mbuf_copydata will overwrite the stack with the entire contents of the packet. The actual size of the packet is always far smaller than the requested length, so mbuf_copydata will return an error code due to the packet not containing enough data. This means that the function will bail out immediately on line 954, after the stack has already been overwritten. Therefore, an attacker can do some return-oriented programming by sending a packet which overwrites the return address on the stack.
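The arithmetic behind this can be sketched in a few lines of C. This is an illustration under stated assumptions, not the kernel source: the function names are hypothetical, the 20-byte constant is the size of a TCP header without options, and checked_options_length is only one possible way to validate the value before using it as a copy length.

```c
#include <stddef.h>
#include <stdint.h>

/* A TCP header without options is 20 bytes long. */
#define TCP_HEADER_LEN 20

/* Options length derived from the data-offset field (in 32-bit words). */
int options_length(unsigned data_offset_words)
{
    return (int)(data_offset_words << 2) - TCP_HEADER_LEN;
}

/* What a callee sees when that int is passed as a size_t parameter. */
size_t as_copy_length(int optlen)
{
    return (size_t)optlen;
}

/*
 * Hypothetical validated variant: stores the length in *out and returns 0,
 * or returns -1 if the length is negative or larger than the buffer.
 */
int checked_options_length(unsigned data_offset_words, size_t buf_size,
                           size_t *out)
{
    int optlen = options_length(data_offset_words);

    if (optlen < 0 || (size_t)optlen > buf_size)
        return -1;
    *out = (size_t)optlen;
    return 0;
}
```

A data offset of zero gives options_length(0) == -20, and as_copy_length(-20) is SIZE_MAX - 19, which is far larger than any stack buffer. The checked variant rejects that input, while the largest legal data offset of 15 words yields 40 bytes of options.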

Finding vulnerabilities with CodeQL

As there are multiple problems with this code, there are also multiple queries that we could write to find them. The query that first led me to look at the code in pktmnglr_ipfilter_input was something like this:

import cpp
import semmle.code.cpp.rangeanalysis.SimpleRangeAnalysis

// Find an assignment like this:  x[i+j] = v
from ArrayExpr ae, BinaryArithmeticOperation idx, Assignment assign
where ae = assign.getLValue()
  and idx = ae.getArrayOffset()
  and convertedExprMightOverflow(idx)
select idx, "Array index might overflow"

This query looks for an array access on the left-hand side of an assignment, such as x[i+j] = v. I wrote it because I was looking for array accesses where there is a possibility of an integer overflow in the calculation of the array index. (I used convertedExprMightOverflow from the SimpleRangeAnalysis CodeQL library to determine whether the array index might overflow.) Unlike the queries in the CodeQL query set, this is not a polished query. However, since it only has 66 results, it doesn’t take long to browse through the results to see if any of them look interesting. The great thing about CodeQL is that you can write a disposable query like this in minutes and quickly find interesting parts of the codebase to inspect.

The interesting result that this query finds is line 990. Looking at this code, it was not at all obvious to me why i+j should be inside the bounds of tcp_opt_buf. I looked more closely and started to notice other things: for example, on line 971 mptcpoptlen is initialized with a value that is read directly from the packet without any kind of bounds check.
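When triaging a result like this, the thing to look for is a check that ties the computed index to the size of the array. The sketch below shows one way such a check could be written for a store of the form x[i+j] = v. The function name and signature are hypothetical and are not taken from the XNU source.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Stores v at buf[i + j].
 * Returns 0 on success, or -1 if the index would fall outside the buffer.
 */
int store_at_sum(uint8_t *buf, size_t buf_size, size_t i, size_t j, uint8_t v)
{
    /* Compare j against the room left after i, so i + j cannot wrap. */
    if (i >= buf_size || j >= buf_size - i)
        return -1;
    buf[i + j] = v;
    return 0;
}
```

The comparison is written as j >= buf_size - i rather than i + j >= buf_size so that the check itself cannot overflow when i or j is very large.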

Often I find that bad code gives me inspiration for new queries. In this case, one of the main bugs in pktmnglr_ipfilter_input is the loop on line 958 which doesn’t always terminate. The following simple query looks for that pattern:

import cpp

// Find loops like this:
// while (x) { ...; x -= n; }
from Loop loop, Variable v, AssignArithmeticOperation assign
where loop.getCondition() = v.getAnAccess()
  and assign.getLValue() = v.getAnAccess()
  // Compound assignment is in the body of the loop:
  and assign = loop.getStmt().getAChild*()
select loop, loop.getFile().getRelativePath()

This query finds 53 results. Apart from the loop in pktmnglr_ipfilter_input, none of them look like bugs to me. But this is a query that I can reuse when I look at other codebases.
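For a loop that this query flags, one way to make termination obvious is to drive the loop with an index that strictly increases and to validate every length byte before using it. The following is a minimal sketch of that idea for TCP-style options (kind byte, length byte, payload). It is not Apple's fix, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walks a buffer of TCP-style options and returns the number of options
 * seen, or -1 if the buffer is malformed.
 */
int count_options_checked(const uint8_t *opts, size_t optlen)
{
    size_t i = 0;
    int count = 0;

    while (i < optlen) {
        uint8_t kind = opts[i];

        if (kind == 0)                  /* end of option list */
            break;
        if (kind == 1) {                /* NOP: a single-byte option */
            i++;
            count++;
            continue;
        }
        if (optlen - i < 2)
            return -1;                  /* no room for a length byte */

        size_t len = opts[i + 1];
        if (len < 2 || len > optlen - i)
            return -1;                  /* zero, too short, or past the end */
        i += len;                       /* always advances by at least 2 */
        count++;
    }
    return count;
}
```

Because every iteration either returns, breaks, or advances i by at least one byte, the loop runs at most optlen times. The bytes {2, 0, 0, 0} are rejected with -1 instead of spinning forever.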

Proof-of-concept exploit

The packet-mangler is usually not enabled by default, so first you need to switch it on. Apple provide sample code to do this, which you can download as follows:

curl https://opensource.apple.com/source/network_cmds/network_cmds-543/pktmnglr/packet_mangler.c -O
curl https://opensource.apple.com/source/xnu/xnu-4570.41.2/bsd/net/packet_mangler.h -O

A small edit is required to packet_mangler.c to make it compile. Replace this line:

#include <net/packet_mangler.h>

with this:

#define PRIVATE
#include "packet_mangler.h"

Now compile and run it as follows:

cc packet_mangler.c
sudo ./a.out -p tcp -M 1

You can confirm that the packet-mangler is enabled now by running netstat. Killing the a.out process will switch the packet-mangler back off, so your Mac will only be temporarily vulnerable to attack while you run this command.

The proof-of-concept exploit is designed to be compiled and run on a Linux machine:

curl https://lgtm.com/blog/static/apple_xnu_packet_mangler_CVE-2017-13904/cve-2017-13904-poc.c -O
gcc cve-2017-13904-poc.c

Suppose the IP address of the Linux machine is 10.0.0.5 and the IP address of the Mac is 10.0.0.6. Then you can trigger the infinite loop like this:

sudo ./a.out 10.0.0.5 10.0.0.6 infinite

and the stack buffer overflow like this:

sudo ./a.out 10.0.0.5 10.0.0.6 smashstack

The PoC uses the raw socket interface to create a packet with a deliberately malformed TCP header. The code is mostly derived from this blog post by Silver Moon, which explains how to use raw TCP sockets in C.

Invitation to security researchers

The technology that powers this analysis is CodeQL.

Timeline

The timeline below requires a bit of explanation. In July 2017, I found the infinite loop vulnerability in the packet-mangler and reported it to Apple. A month later, I received a formal acknowledgment from product-security@apple.com saying that they would address the issue in a future update. In February 2018, I contacted product-security@apple.com again to ask when the vulnerability would be fixed. A few days later, I received a reply from Apple which said that the bug had already been fixed in macOS version 10.13.2 (December 6, 2017). They apologized for forgetting to acknowledge me, retroactively assigned CVE-2017-13904 to the vulnerability, and updated their December security bulletin.

I was puzzled by this because I could not see any recent code changes in the relevant function. Upon further investigation, I discovered that my proof-of-concept exploit (from July 2017) had indeed stopped working. I had originally written it in a bit of a hurry, and it turned out that it triggered a slightly different code path than I thought. (I don’t have any tools for debugging the kernel, so I did not realize what it was really doing.) Its actual behavior depended on data that had been left on the stack by a previous function call, so it was non-deterministic and unreliable.

I started working on a new, more reliable proof-of-concept exploit. While I was doing this, I also discovered the stack buffer overflow vulnerability, which I had not noticed when I originally reported the infinite loop vulnerability. So I added a second mode to the proof-of-concept exploit and sent it to product-security@apple.com on February 14, 2018. Apple decided to treat the new PoC as a new vulnerability and informed me that the fix would probably be released on May 25, 2018. On June 1, 2018, they released a patch and assigned CVE-2018-4249 to the vulnerabilities.

Note: Post originally published on LGTM.com on June 01, 2018