Like a good coffee, it all starts with a good bean, or in this case, it was more like a seed. The analogy, which I promise is going to end soon, stands pretty well. It starts with a beautiful red coffee cherry, that has to be refined by professional hands with the knowledge and skills needed to get the most out of every bean.

There I was, on the morning of the 28th of August of 2019 (I know, because Mitre knows), breaking fast with my first cup of Colombian/Brazilian coffee brew when I came across CVE-2019-14814. It was not an isolated bug, it came with two more friends CVE-2019-14815 and CVE-2019-14816. I’m far from a Linux Kernel expert, in fact, if you want to read someone that really knows what they are talking about, read Attacking the Core by Enrico Perla y Massimiliano Oldani, but I decided to dig into the vulnerability.

A crash course on 802.11 framing

There are three types of frame packages on the 802.11 layer 2: Data Frame, Control Frame, and Management Frame. For the sake of this vulnerability we are going to concentrate on management frames, whose main functions are to join and leave wireless networks, and to move associations between different access points. Management frames include the Probe, Beacon, Association, Authentication, and Deauthentication subtypes, among others.

All of the different management frames have one thing in common, a similar MAC header:

MAC header

The primary elements are: Frame Control, which holds the type and subtype of the Frame, DA, SA, and BSSID are addresses that define the Destination, Source and BBSID correspondingly. We are going to concentrate on the Information Elements (IE), these are variable-length elements, described by an EID, which use a TLD format (Type, Length and Data). The number and type of the IE will vary depending on the frame.

Information Elements

For example, a Beacon package will have the following fields on the frame body:

More information about 802.11 Framing, can be found here

The seed, the bean, or the cherry

The original vulnerability is in the Marvell 802.11n driver on the function mwifiex_set_uap_rates, which takes the supported rates IE from a beacon and sets them into a bss_config structure. This function parses the following IE elements: the standard Supported Rates (WLAN_EID_SUPP_RATES) and the Extended Supported Rates (WLAN_EID_EXT_SUPP_RATES). The bug class for CVE-2019-14815 and CVE-2019-14816 is the same and is actually simple. An Information Element, received through the network, is obtained through a call to the kernel function, cfg80211_find_ie. Then the length field is passed directly to memcpy as the length field, without checking if the destination buffer has enough space to hold the length.

Check the code

rate_ie = (void *)cfg80211_find_ie(WLAN_EID_SUPP_RATES, var_pos, len); //[1]
if (rate_ie) {
    memcpy(bss_cfg->rates, rate_ie + 1, rate_ie->len); //[2]
    rate_len = rate_ie->len;
}

rate_ie = (void *)cfg80211_find_ie(WLAN_EID_EXT_SUPP_RATES,
                    params->beacon.tail,
                    params->beacon.tail_len); //[3]
if (rate_ie)
    memcpy(bss_cfg->rates + rate_len, rate_ie + 1, rate_ie->len); //[4]

As can be seen in this code snippet, there are two bugs next to each other, both call cfg80211_find_ie ([1], [3]) to obtain the rate-related Information Element, received through a Frame Control packet over the network, and store it in rate_ie. The len field (under user control) is used directly as the third argument of memcpy ([2], [4]) and, as a result, a heap overflow could be triggered.

As mentioned before, the interesting attack vector is that the overflow can be triggered remotely through the WiFi network. In this case, it’s triggered when the victim receives a beacon packet and requires no interaction or authentication of any kind.

Variant analysis

As the coffee was getting cold, it was time to make CodeQL do all the hard work for me. One of the things you realize after spending a couple of weeks modeling bug classes, is that many vulnerabilities can be represented as data flow. Most of the time, the data-flow source is an entry point to user-controlled data and the sink could be a vulnerable method or code pattern.

import cpp
import semmle.code.cpp.dataflow.TaintTracking

class GetIE extends TaintTracking::Configuration {
	GetIE() { this="cfg80211_find_ie" }
	
	override predicate isSource(DataFlow::Node source) {
		exists( Function sl | 
			source.asExpr().(FunctionCall).getTarget() = sl and
			sl.hasQualifiedName("cfg80211_find_ie")
	
		)
	}
	override predicate isSink(DataFlow::Node sink) {
		exists(FunctionCall fc |
		sink.asExpr() = fc.getArgument(2) and
		fc.getTarget().hasQualifiedName("memcpy")
		)
	}	
}
from Expr sl, Expr mal, GetIE config
where config.hasFlow(DataFlow::exprNode(sl) , DataFlow::exprNode(mal) )
select mal, "This 'cfg80211_find_ie' use data from $@.", sl

The concept behind the query is very straightforward. I’m overriding the predicates isSource and isSink to define the two points where the data will flow from and to.

In more detail, you can observe in the snippet below that I’m assigning the return value from a call to cfg80211_find_ie as the source of the data flow.

override predicate isSource(DataFlow::Node source) {
		exists( Function sl | 
			source.asExpr().(FunctionCall).getTarget() = sl and
			sl.hasQualifiedName("cfg80211_find_ie")
	
		)
    }

For the sink, the function associates the third argument of memcpy, the length, with the sink.

override predicate isSource(DataFlow::Node source) {
		exists( Function sl | 
			source.asExpr().(FunctionCall).getTarget() = sl and
			sl.hasQualifiedName("cfg80211_find_ie")
	
		)
    }

As a result, our data flow will taint the result cfg80211_find_ie, this is not just the address but any part of the content, for example in the case shown before, rate_ie[1] will still be tracked by the data flow.

Sadly, there were no interesting results from this query…

Time to brew!

CodeQL is not just for variant analysis, it’s also a fantastic tool for exploratory queries. I decided not to stop after cfg80211_find_ie, and to explore similar functions where the result returns an IE. I’m sure there are more scientific ways to do that, but if the reader has gone so deep into this article, I assume they’re here to enjoy more CodeQL:

import cpp
select any(Function f | f.getName().regexpMatch(".*_ie"))

The result was a long series of methods that perform similar actions, if given a byte stream, they will obtain a specific Information Element and return it.

results

Fast forward some hours invested in understanding how those methods were used and triaging potential vulnerabilities, and I found a winner: ieee80211_bss_get_ie. The query below is very similar to the one I built for CVE-2019-14815, the only difference is the replacement of the FunctionCall name.

import cpp
import semmle.code.cpp.dataflow.TaintTracking

class GetIE extends TaintTracking::Configuration {
	GetIE() { this="ieee80211_bss_get_ie" }
	
	override predicate isSource(DataFlow::Node source) {
		exists( Function sl | 
			source.asExpr().(FunctionCall).getTarget() = sl and
			sl.hasQualifiedName("ieee80211_bss_get_ie")
	
		)
	}
	override predicate isSink(DataFlow::Node sink) {
		exists(FunctionCall fc |
		sink.asExpr() = fc.getArgument(2) and
		fc.getTarget().hasQualifiedName("memcpy")
		)
	}	
}
from Expr sl, Expr mal, GetIE config
where config.hasFlow(DataFlow::exprNode(sl) , DataFlow::exprNode(mal) )
select mal, "This ieee80211_bss_get_ie use data from $@.", sl

Finding 1 (CVE-2019-16746)

Both of the findings have a very similar code pattern. The pointer ssdie is being used to contain the IE returned by a call to ieee80211_bss_get_ie, the length (ssdie[1]) is copied to join.ssid_len and later used on a call to memcpy with a fully controlled length. Check the code.

if (!conf->ibss_joined) {
    const u8 *ssidie;
    rcu_read_lock();
    ssidie = ieee80211_bss_get_ie(bss, WLAN_EID_SSID);
    if (ssidie) {
        join.ssid_len = ssidie[1];
        memcpy(join.ssid, &ssidie[2], join.ssid_len);

It is important to notice that join is a wsm_join structure and the size of the ssid field is only 32 bytes.

/* Specifies the SSID of the IBSS to join or start */
u8 ssid[32];

Finding 2 (CVE-2019-17133)

The exact same pattern can be found in wext-sme.c. The length is taken from the IE without any check and used as a parameter to copy into the ssid variable on memcpy.

Check the code.

ie = ieee80211_bss_get_ie(&wdev->current_bss->pub,
                WLAN_EID_SSID);
if (ie) {
    data->flags = 1;
    data->length = ie[1];
    memcpy(ssid, ie + 2, data->length);

Vulnerability disclosure timeline