This post is about a vulnerability which I found in Apple’s XNU operating system kernel. I have written a proof-of-concept exploit which can reboot any Mac or iOS device on the same network, without any user interaction. Apple have classified this vulnerability as a remote code execution vulnerability in the kernel.

The following operating system versions and devices are vulnerable:

I reported the vulnerability in time for Apple to patch it in iOS 12 (released on September 17) and macOS Mojave (released on September 24). Both patches were announced retrospectively on October 30.

Severity and Mitigation

The vulnerability is an out-of-bounds write (CWE-787) in the networking code in the XNU operating system kernel. XNU is used by both iOS and macOS, which is why iPhones, iPads, and MacBooks are all affected. To trigger the vulnerability, an attacker merely needs to send a malicious IP packet to the IP address of the target device. No user interaction is required. The attacker only needs to be connected to the same network as the target device. For example, if you are using the free WiFi in a coffee shop then an attacker can join the same WiFi network and send a malicious packet to your device. (If an attacker is on the same network as you, it is easy for them to discover your device’s IP address using nmap.) To make matters worse, the vulnerability is in such a fundamental part of the networking code that anti-virus software will not protect you: I tested the vulnerability on a Mac running McAfee® Endpoint Security for Mac and it made no difference. It also doesn’t matter what software you are running on the device - the malicious packet will still trigger the vulnerability even if you don’t have any ports open.

The security bulletin released by Apple says: “An attacker in a privileged network position may be able to execute arbitrary code”. I have not attempted to write an exploit which is capable of doing this. My exploit PoC just causes an immediate kernel crash and device reboot. Update: a few days after the vulnerability was disclosed, I received a very nice email from Ian Beer of Google Project Zero. Based on what he told me (more on this below), it seems quite unlikely that it is possible to actually achieve RCE from this bug. I guess that Apple still classified it as an RCE because any out-of-bounds write of kernel memory could potentially lead to RCE.

I am only aware of two mitigations against this vulnerability:

  1. Enabling stealth mode in the macOS firewall prevents the attack from working. Kudos to my colleague Henti Smith for discovering this, because this is an obscure system setting which is not enabled by default. As far as I’m aware, stealth mode does not exist on iOS devices.

  2. Do not use public WiFi networks. The attacker needs to be on the same network as the target device. It is not usually possible to send the malicious packet across the internet. For example, I wrote a fake web server which sends back a malicious reply when the target device tries to load a webpage. In my experiments, the malicious packet never arrived, except when the web server was on the same network as the target device.

Proof-of-concept exploit

I have written a proof-of-concept exploit which triggers the vulnerability. To give Apple’s users time to upgrade, I will not publish the source code for the exploit PoC immediately. However, I have made a short video which shows the PoC in action, crashing all the Apple devices on the local network.

The vulnerability

I originally believed that the bug was a buffer overflow in this line of code (bsd/netinet/ip_icmp.c:339):

m_copydata(n, 0, icmplen, (caddr_t)&icp->icmp_ip);

That’s what I told Apple when I reported the bug to them, and they did not dispute my claim (even though they almost certainly knew that I was wrong). However, as I mentioned above, I received an email from Ian Beer a few days after the vulnerability was disclosed. Ian Beer explained that the out-of-bounds write actually happens a few lines earlier (bsd/netinet/ip_icmp.c:320):

icp->icmp_type = type;

This code is in the function icmp_error. According to the comment, the purpose of this function is to “Generate an error packet of type error in response to bad packet ip”. It uses the ICMP protocol to send out the error message. The header of the packet that caused the error is included in the ICMP message, so the purpose of the call to m_copydata on line 339 is to copy the header of the bad packet into the ICMP message. The problem is that the code goes wrong when the header is unusually large. The destination buffer is an mbuf. mbuf is a datatype which is used to store both incoming and outgoing network packets. In this code, n is an incoming packet (containing untrusted data) and m is an outgoing ICMP packet. As we will see shortly, icp is a pointer into m. m is allocated on line 294 or line 296:

if (MHLEN > (sizeof(struct ip) + ICMP_MINLEN + icmplen))
  m = m_gethdr(M_DONTWAIT, MT_HEADER);  /* MAC-OK */
else
  m = m_getcl(M_DONTWAIT, MT_DATA, M_PKTHDR);
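To make the allocation condition concrete, here is a minimal sketch in plain C. It assumes MHLEN is 88 (the value quoted later in this post), that sizeof(struct ip) is 20 (an IPv4 header without options) and that ICMP_MINLEN is 8. The SKETCH_ names and the helper function are made up for illustration and are not part of XNU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative constants, not taken from the XNU headers. */
#define SKETCH_MHLEN       ((size_t)88) /* MHLEN, as quoted in the post */
#define SKETCH_IP_HDR_LEN  ((size_t)20) /* assumed sizeof(struct ip) */
#define SKETCH_ICMP_MINLEN ((size_t)8)  /* assumed ICMP_MINLEN */

/* A minimal ICMP error (icmplen == 0) must fit in a header mbuf. */
static_assert(20 + 8 < 88, "minimal ICMP error fits in MHLEN");

/*
 * Mirrors the condition in the snippet above: returns true when the
 * m_gethdr branch would be taken, false when m_getcl would be used.
 */
static bool sketch_uses_header_mbuf(size_t icmplen)
{
    return SKETCH_MHLEN > SKETCH_IP_HDR_LEN + SKETCH_ICMP_MINLEN + icmplen;
}
```

Under these assumptions the m_gethdr branch is taken for icmplen up to 59, and m_getcl is used from icmplen = 60 onwards.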

Slightly further down, on line 314, mtod is used to get m’s data pointer:

icp = mtod(m, struct icmp *);

mtod is just a macro, so this line of code does not check that the mbuf is large enough to hold an icmp struct. Furthermore, the data is not copied to icp, but to &icp->icmp_ip, which is at an offset of +8 bytes from icp. If icmplen >= 60 then the mbuf is allocated by m_getcl, rather than m_gethdr, but the code does not appear to check anywhere that the allocated mbuf is big enough. However, thanks to Ian Beer, I now understand that it does not matter how big the mbuf allocated by m_getcl is, because there is a negative integer overflow on line 313:

MH_ALIGN(m, m->m_len);

This is the definition of the MH_ALIGN macro:

#define MH_ALIGN(m, len)                                  \
do {                                                      \
  (m)->m_data += (MHLEN - (len)) &~ (sizeof (long) - 1);  \
} while (0)

The value of len here is icmplen + 8. The value of MHLEN is 88, so if icmplen > 80 then a negative integer overflow happens and m->m_data is incremented by just under 4GB. This means that icp gets assigned a bogus data pointer on the next line and the assignment to icp->icmp_type on line 320 causes an out-of-bounds write.
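The wraparound can be reproduced outside the kernel with a small model of the MH_ALIGN arithmetic. This is only a sketch: it assumes the subtraction is carried out in 32-bit unsigned arithmetic (which is what the "just under 4GB" figure suggests) and that sizeof(long) is 8, as on a 64-bit kernel. The MODEL_ names and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants, not taken from the XNU headers. */
#define MODEL_MHLEN       88u /* MHLEN, as quoted in the post */
#define MODEL_ICMP_MINLEN 8u  /* the "+ 8" in len = icmplen + 8 */
#define MODEL_ALIGN       8u  /* assumed sizeof(long) on a 64-bit kernel */

static_assert((MODEL_ALIGN & (MODEL_ALIGN - 1u)) == 0,
              "alignment must be a power of two");

/* Amount that MH_ALIGN would add to m_data, modelled in 32-bit arithmetic. */
static uint32_t model_mh_align_offset(uint32_t icmplen)
{
    uint32_t len = icmplen + MODEL_ICMP_MINLEN;
    return (MODEL_MHLEN - len) & ~(MODEL_ALIGN - 1u);
}

/* True when len exceeds MHLEN, which is when the subtraction wraps around. */
static bool model_mh_align_wraps(uint32_t icmplen)
{
    return icmplen + MODEL_ICMP_MINLEN > MODEL_MHLEN;
}
```

In this model, icmplen = 80 gives an offset of 0, while icmplen = 81 gives 0xFFFFFFF8, which is 8 bytes short of 4GB.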

At this time, I will not say any more about how the exploit works. I want to give Apple users a chance to upgrade their devices first. However, in the relatively near future I will publish the source code for the exploit PoC in our SecurityExploits repository.

Finding the vulnerability with CodeQL

I found this vulnerability by doing variant analysis on the bug that caused the buffer overflow vulnerability in the packet-mangler. That vulnerability was caused by a call to mbuf_copydata with a user-controlled size argument. So I wrote a simple query to look for similar bugs:

/**
 * @name mbuf copydata with tainted size
 * @description Calling m_copydata with an untrusted size argument
 *              could cause a buffer overflow.
 * @kind path-problem
 * @problem.severity warning
 * @id apple-xnu/cpp/mbuf-copydata-with-tainted-size
 */

import cpp
import semmle.code.cpp.dataflow.TaintTracking
import DataFlow::PathGraph

class Config extends TaintTracking::Configuration {
  Config() { this = "tcphdr_flow" }

  override predicate isSource(DataFlow::Node source) {
    source.asExpr().(FunctionCall).getTarget().getName() = "m_mtod"
  }

  override predicate isSink(DataFlow::Node sink) {
    exists (FunctionCall call
    | call.getArgument(2) = sink.asExpr() and
      call.getTarget().getName().matches("%copydata"))
  }
}

from Config cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "m_copydata with tainted size."

This is a simple taint-tracking query which looks for dataflow from m_mtod to the size argument of a “copydata” function. The function named m_mtod returns the data pointer of an mbuf, so it is quite likely that it will return untrusted data. It is what the mtod macro expands to. Obviously m_mtod is just one of many sources of untrusted data in the XNU kernel, but I have not included any other sources to keep the query as simple as possible. This query returns 9 results, the first of which is the vulnerability in icmp_error. I believe the other 8 results are false positives, but the code is sufficiently complicated that I do not consider them to be bad query results.
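To illustrate the shape of code that this kind of query is looking for, here is a self-contained toy in C. Nothing in it comes from XNU: struct toy_buf stands in for an mbuf, toy_copydata stands in for a “copydata” function (with the size as its third argument, matching argument index 2 in the query), and toy_copy_payload is a hypothetical caller that reads a length out of the packet data and uses it as the copy size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf: a pointer to packet bytes plus a length. */
struct toy_buf {
    const uint8_t *data;
    size_t len;
};

/* Stand-in for a "copydata" function: copy len bytes at offset off into dst. */
static int toy_copydata(const struct toy_buf *src, size_t off, size_t len,
                        void *dst)
{
    assert(src != NULL && dst != NULL);
    if (off > src->len || len > src->len - off)
        return -1; /* would read past the end of the source buffer */
    memcpy(dst, src->data + off, len);
    return 0;
}

/* Reads a length byte from the packet and uses it as the copy size. */
static int toy_copy_payload(const struct toy_buf *pkt, uint8_t *dst,
                            size_t dst_cap)
{
    if (pkt->len < 1)
        return -1;
    size_t claimed = pkt->data[0]; /* tainted: comes from packet data */
    if (claimed > dst_cap)
        return -1; /* without this check, dst could be overflowed */
    return toy_copydata(pkt, 1, claimed, dst);
}
```

A taint-tracking query like the one above, which defines no sanitizers, would most likely report the flow from the packet data to the size argument whether or not the bounds check is present. Deciding if the size is adequately validated is then a manual review step, which is consistent with most of the results turning out to be false positives.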

Try CodeQL on XNU

Unlike most other open source projects, XNU is not available to query on LGTM. This is because LGTM uses Linux workers to build projects, but XNU can only be built on a Mac. Even on a Mac, XNU is highly non-trivial to build. I would not have been able to do it if I had not found this incredibly useful blog post by Jeremy Andrus. Using Jeremy Andrus’s instructions and scripts, I have manually built snapshots for the three most recent published versions of XNU: 10.13.4, 10.13.5 and 10.13.6. Unfortunately, at the time of writing, Apple have not yet released the source code for 10.14 (Mojave / iOS 12). To run queries on those versions, you would first need to create the corresponding CodeQL databases using the CodeQL CLI, and then install the CodeQL extension for VSCode or Eclipse.

[EDIT]: You can also use our free CodeQL extension for Visual Studio Code. See installation instructions at https://securitylab.github.com/tools/codeql/.

Timeline

"Send it back"

Updates

Credits

Note: Post originally published on LGTM.com on October 30, 2018