Like a good coffee, it all starts with a good bean, or in this case, it was more like a seed. The analogy, which I promise is going to end soon, stands pretty well. It starts with a beautiful red coffee cherry, that has to be refined by professional hands with the knowledge and skills needed to get the most out of every bean.
There I was, on the morning of the 28th of August of 2019 (I know, because Mitre knows), breaking fast with my first cup of Colombian/Brazilian coffee brew when I came across CVE-2019-14814. It was not an isolated bug, it came with two more friends CVE-2019-14815 and CVE-2019-14816. I’m far from a Linux Kernel expert, in fact, if you want to read someone that really knows what they are talking about, read Attacking the Core by Enrico Perla y Massimiliano Oldani, but I decided to dig into the vulnerability.
A crash course on 802.11 framing
There are three types of frame packages on the 802.11 layer 2: Data Frame, Control Frame, and Management Frame. For the sake of this vulnerability we are going to concentrate on management frames, whose main functions are to join and leave wireless networks, and to move associations between different access points. Management frames include the Probe, Beacon, Association, Authentication, and Deauthentication subtypes, among others.
All of the different management frames have one thing in common, a similar MAC header:
The primary elements are: Frame Control, which holds the type and subtype of the Frame, DA, SA, and BSSID are addresses that define the Destination, Source and BBSID correspondingly. We are going to concentrate on the Information Elements (IE), these are variable-length elements, described by an EID, which use a TLD format (Type, Length and Data). The number and type of the IE will vary depending on the frame.
For example, a Beacon package will have the following fields on the frame body:
- Timestamp (Fixed)
- Capability info (Fixed)
- SSID (IE)
- Supported Rates (IE)
More information about 802.11 Framing, can be found here
The seed, the bean, or the cherry
The original vulnerability is in the Marvell 802.11n driver
on the function mwifiex_set_uap_rates
,
which takes the supported rates IE from a beacon
and sets them into a bss_config
structure.
This function parses the following IE elements:
the standard Supported Rates (WLAN_EID_SUPP_RATES
)
and the Extended Supported Rates (WLAN_EID_EXT_SUPP_RATES
).
The bug class for CVE-2019-14815 and CVE-2019-14816
is the same and is actually simple.
An Information Element, received through the network,
is obtained through a call to the kernel function, cfg80211_find_ie
.
Then the length field is passed directly to memcpy
as the length field,
without checking if the destination buffer has enough space to hold the length.
rate_ie = (void *)cfg80211_find_ie(WLAN_EID_SUPP_RATES, var_pos, len); //[1]
if (rate_ie) {
memcpy(bss_cfg->rates, rate_ie + 1, rate_ie->len); //[2]
rate_len = rate_ie->len;
}
rate_ie = (void *)cfg80211_find_ie(WLAN_EID_EXT_SUPP_RATES,
params->beacon.tail,
params->beacon.tail_len); //[3]
if (rate_ie)
memcpy(bss_cfg->rates + rate_len, rate_ie + 1, rate_ie->len); //[4]
As can be seen in this code snippet, there are two bugs next to each other,
both call cfg80211_find_ie
([1], [3]) to obtain the rate-related Information Element,
received through a Frame Control packet over the network, and store it in rate_ie
.
The len
field (under user control) is used directly as the third argument of memcpy
([2], [4])
and, as a result, a heap overflow could be triggered.
As mentioned before, the interesting attack vector is that the overflow can be triggered remotely through the WiFi network. In this case, it’s triggered when the victim receives a beacon packet and requires no interaction or authentication of any kind.
Variant analysis
As the coffee was getting cold, it was time to make CodeQL do all the hard work for me. One of the things you realize after spending a couple of weeks modeling bug classes, is that many vulnerabilities can be represented as data flow. Most of the time, the data-flow source is an entry point to user-controlled data and the sink could be a vulnerable method or code pattern.
import cpp
import semmle.code.cpp.dataflow.TaintTracking
class GetIE extends TaintTracking::Configuration {
GetIE() { this="cfg80211_find_ie" }
override predicate isSource(DataFlow::Node source) {
exists( Function sl |
source.asExpr().(FunctionCall).getTarget() = sl and
sl.hasQualifiedName("cfg80211_find_ie")
)
}
override predicate isSink(DataFlow::Node sink) {
exists(FunctionCall fc |
sink.asExpr() = fc.getArgument(2) and
fc.getTarget().hasQualifiedName("memcpy")
)
}
}
from Expr sl, Expr mal, GetIE config
where config.hasFlow(DataFlow::exprNode(sl) , DataFlow::exprNode(mal) )
select mal, "This 'cfg80211_find_ie' use data from $@.", sl
The concept behind the query is very straightforward.
I’m overriding the predicates isSource
and isSink
to define the two points where the data will flow from and to.
In more detail, you can observe in the snippet below that
I’m assigning the return value from a call to cfg80211_find_ie
as the source of the data flow.
override predicate isSource(DataFlow::Node source) {
exists( Function sl |
source.asExpr().(FunctionCall).getTarget() = sl and
sl.hasQualifiedName("cfg80211_find_ie")
)
}
For the sink, the function associates the third argument of memcpy
, the length, with the sink.
override predicate isSource(DataFlow::Node source) {
exists( Function sl |
source.asExpr().(FunctionCall).getTarget() = sl and
sl.hasQualifiedName("cfg80211_find_ie")
)
}
As a result, our data flow will taint the result cfg80211_find_ie
,
this is not just the address but any part of the content,
for example in the case shown before, rate_ie[1]
will still be tracked by the data flow.
Sadly, there were no interesting results from this query…
Time to brew!
CodeQL is not just for variant analysis,
it’s also a fantastic tool for exploratory queries.
I decided not to stop after cfg80211_find_ie
,
and to explore similar functions where the result returns an IE.
I’m sure there are more scientific ways to do that,
but if the reader has gone so deep into this article,
I assume they’re here to enjoy more CodeQL:
import cpp
select any(Function f | f.getName().regexpMatch(".*_ie"))
The result was a long series of methods that perform similar actions, if given a byte stream, they will obtain a specific Information Element and return it.
Fast forward some hours invested in understanding how those methods were used
and triaging potential vulnerabilities, and I found a winner: ieee80211_bss_get_ie
.
The query below is very similar to the one I built for CVE-2019-14815,
the only difference is the replacement of the FunctionCall
name.
import cpp
import semmle.code.cpp.dataflow.TaintTracking
class GetIE extends TaintTracking::Configuration {
GetIE() { this="ieee80211_bss_get_ie" }
override predicate isSource(DataFlow::Node source) {
exists( Function sl |
source.asExpr().(FunctionCall).getTarget() = sl and
sl.hasQualifiedName("ieee80211_bss_get_ie")
)
}
override predicate isSink(DataFlow::Node sink) {
exists(FunctionCall fc |
sink.asExpr() = fc.getArgument(2) and
fc.getTarget().hasQualifiedName("memcpy")
)
}
}
from Expr sl, Expr mal, GetIE config
where config.hasFlow(DataFlow::exprNode(sl) , DataFlow::exprNode(mal) )
select mal, "This ieee80211_bss_get_ie use data from $@.", sl
Finding 1 (CVE-2019-16746)
Both of the findings have a very similar code pattern.
The pointer ssdie
is being used to contain the IE
returned by a call to ieee80211_bss_get_ie
,
the length (ssdie[1]
) is copied to join.ssid_len
and later used on a call to memcpy
with a fully controlled length.
Check the code.
if (!conf->ibss_joined) {
const u8 *ssidie;
rcu_read_lock();
ssidie = ieee80211_bss_get_ie(bss, WLAN_EID_SSID);
if (ssidie) {
join.ssid_len = ssidie[1];
memcpy(join.ssid, &ssidie[2], join.ssid_len);
It is important to notice that join
is a wsm_join
structure
and the size of the ssid
field is only 32 bytes.
/* Specifies the SSID of the IBSS to join or start */
u8 ssid[32];
Finding 2 (CVE-2019-17133)
The exact same pattern can be found in wext-sme.c
.
The length is taken from the IE without any check
and used as a parameter to copy into the ssid
variable on memcpy
.
ie = ieee80211_bss_get_ie(&wdev->current_bss->pub,
WLAN_EID_SSID);
if (ie) {
data->flags = 1;
data->length = ie[1];
memcpy(ssid, ie + 2, data->length);
Vulnerability disclosure timeline
- August, 28th 2019 - CVE-2019-14814 found by huangwen was released with information about the vulnerabilities.
- August, 29th 2019 - Github researcher created a QL query and started triaging the vulnerability.
- August, 30th 2019 - Vulnerabilities found and were submitted to the Linux Kernel security mailing list for review.
- September, 20th 2019 - Linux team mailed the public mailing list (LKML) with a patch fixing some of the issues (nl80211/cfg80211).
- October, 4th 2019 - Linux team provided more fixes into LKML (mac80211/wext-sme).