skip to content
Back to GitHub.com
Home Bounties CodeQL Research Advisories Get Involved Events
November 14, 2019

Anatomy of a Coffee Bean (Wireless Vulnerabilities in Linux Kernel)

Nico Waisman
Like a good coffee, it all starts with a good bean, or in this case, it was more like a seed. The analogy, which I promise is going to end soon, stands pretty well. It starts with a beautiful red coffee cherry, that has to be refined by professional hands with the knowledge and skills needed to get the most out of every bean. There I was, on the morning of the 28th of August of 2019 (I know, because Mitre knows), breaking fast with my first cup of Colombian/Brazilian coffee brew when I came across CVE-2019-14814. It was not an isolated bug, it came with two more friends CVE-2019-14815 and CVE-2019-14816. I’m far from a Linux Kernel expert, in fact, if you want to read someone that really knows what they are talking about, read [Attacking the Core](https://www.amazon.com/Guide-Kernel-Exploitation-Attacking-Core/dp/1597494860/ref=sr_1_1?keywords=attacking+the+core&qid=1570713863&sr=8-1) by Enrico Perla y Massimiliano Oldani, but I decided to dig into the vulnerability. ## A crash course on 802.11 framing There are three types of frame packages on the 802.11 layer 2: Data Frame, Control Frame, and Management Frame. For the sake of this vulnerability we are going to concentrate on management frames, whose main functions are to join and leave wireless networks, and to move associations between different access points. Management frames include the Probe, Beacon, Association, Authentication, and Deauthentication subtypes, among others. All of the different management frames have one thing in common, a similar MAC header: ![MAC header](/assets/img/post-images/mac-header.png) The primary elements are: Frame Control, which holds the type and subtype of the Frame, DA, SA, and BSSID are addresses that define the Destination, Source and BBSID correspondingly. We are going to concentrate on the Information Elements (IE), these are variable-length elements, described by an EID, which use a TLD format (Type, Length and Data). The number and type of the IE will vary depending on the frame. ![Information Elements](/assets/img/post-images/IE.png) For example, a Beacon package will have the following fields on the frame body: - Timestamp (Fixed) - Capability info (Fixed) - SSID (IE) - Supported Rates (IE) More information about 802.11 Framing, can be found [here](http://osnet.cs.nchu.edu.tw/powpoint/seminar/802.11/802.11Framing%20in%20Detail.pdf) ## The seed, the bean, or the cherry The original vulnerability is in the Marvell 802.11n driver on the function `mwifiex_set_uap_rates`, which takes the supported rates IE from a beacon and sets them into a `bss_config` structure. This function parses the following IE elements: the standard Supported Rates (`WLAN_EID_SUPP_RATES`) and the Extended Supported Rates (`WLAN_EID_EXT_SUPP_RATES`). The bug class for CVE-2019-14815 and CVE-2019-14816 is the same and is actually simple. An Information Element, received through the network, is obtained through a call to the kernel function, `cfg80211_find_ie`. Then the length field is passed directly to `memcpy` as the length field, without checking if the destination buffer has enough space to hold the length. [Check the code](https://github.com/torvalds/linux/blob/69ae4f6aac1578575126319d3f55550e7e440449/drivers/net/wireless/marvell/mwifiex/uap_cmd.c#L266) ```cpp rate_ie = (void *)cfg80211_find_ie(WLAN_EID_SUPP_RATES, var_pos, len); //[1] if (rate_ie) { memcpy(bss_cfg->rates, rate_ie + 1, rate_ie->len); //[2] rate_len = rate_ie->len; } rate_ie = (void *)cfg80211_find_ie(WLAN_EID_EXT_SUPP_RATES, params->beacon.tail, params->beacon.tail_len); //[3] if (rate_ie) memcpy(bss_cfg->rates + rate_len, rate_ie + 1, rate_ie->len); //[4] ``` As can be seen in this code snippet, there are two bugs next to each other, both call `cfg80211_find_ie` (\[1\], \[3\]) to obtain the rate-related Information Element, received through a Frame Control packet over the network, and store it in `rate_ie`. The `len` field (under user control) is used directly as the third argument of `memcpy` (\[2\], \[4\]) and, as a result, a heap overflow could be triggered. As mentioned before, the interesting attack vector is that the overflow can be triggered remotely through the WiFi network. In this case, it’s triggered when the victim receives a beacon packet and requires no interaction or authentication of any kind. ## Variant analysis As the coffee was getting cold, it was time to make CodeQL do all the hard work for me. One of the things you realize after spending a couple of weeks modeling bug classes, is that many vulnerabilities can be represented as data flow. Most of the time, the data-flow source is an entry point to user-controlled data and the sink could be a vulnerable method or code pattern. ```ql import cpp import semmle.code.cpp.dataflow.TaintTracking class GetIE extends TaintTracking::Configuration { GetIE() { this="cfg80211_find_ie" } override predicate isSource(DataFlow::Node source) { exists( Function sl | source.asExpr().(FunctionCall).getTarget() = sl and sl.hasQualifiedName("cfg80211_find_ie") ) } override predicate isSink(DataFlow::Node sink) { exists(FunctionCall fc | sink.asExpr() = fc.getArgument(2) and fc.getTarget().hasQualifiedName("memcpy") ) } } from Expr sl, Expr mal, GetIE config where config.hasFlow(DataFlow::exprNode(sl) , DataFlow::exprNode(mal) ) select mal, "This 'cfg80211_find_ie' use data from $@.", sl ``` The concept behind the query is very straightforward. I’m overriding the predicates `isSource` and `isSink` to define the two points where the data will flow from and to. In more detail, you can observe in the snippet below that I’m assigning the return value from a call to `cfg80211_find_ie` as the **source** of the data flow. ```ql override predicate isSource(DataFlow::Node source) { exists( Function sl | source.asExpr().(FunctionCall).getTarget() = sl and sl.hasQualifiedName("cfg80211_find_ie") ) } ``` For the **sink**, the function associates the third argument of `memcpy`, the length, with the sink. ```ql override predicate isSource(DataFlow::Node source) { exists( Function sl | source.asExpr().(FunctionCall).getTarget() = sl and sl.hasQualifiedName("cfg80211_find_ie") ) } ``` As a result, our data flow will taint the result `cfg80211_find_ie`, this is not just the address but any part of the content, for example in the case shown before, `rate_ie[1]` will still be tracked by the data flow. Sadly, there were no interesting results from this query... ## Time to brew! CodeQL is not just for variant analysis, it’s also a fantastic tool for exploratory queries. I decided not to stop after `cfg80211_find_ie`, and to explore similar functions where the result returns an IE. I’m sure there are more scientific ways to do that, but if the reader has gone so deep into this article, I assume they’re here to enjoy more CodeQL: ```ql import cpp select any(Function f | f.getName().regexpMatch(".*_ie")) ``` The result was a long series of methods that perform similar actions, if given a byte stream, they will obtain a specific Information Element and return it. ![results](/assets/img/post-images/ie-query.png) Fast forward some hours invested in understanding how those methods were used and triaging potential vulnerabilities, and I found a winner: `ieee80211_bss_get_ie`. The query below is very similar to the one I built for CVE-2019-14815, the only difference is the replacement of the `FunctionCall` name. ```ql import cpp import semmle.code.cpp.dataflow.TaintTracking class GetIE extends TaintTracking::Configuration { GetIE() { this="ieee80211_bss_get_ie" } override predicate isSource(DataFlow::Node source) { exists( Function sl | source.asExpr().(FunctionCall).getTarget() = sl and sl.hasQualifiedName("ieee80211_bss_get_ie") ) } override predicate isSink(DataFlow::Node sink) { exists(FunctionCall fc | sink.asExpr() = fc.getArgument(2) and fc.getTarget().hasQualifiedName("memcpy") ) } } from Expr sl, Expr mal, GetIE config where config.hasFlow(DataFlow::exprNode(sl) , DataFlow::exprNode(mal) ) select mal, "This ieee80211_bss_get_ie use data from $@.", sl ``` ## Finding 1 (CVE-2019-16746) Both of the findings have a very similar code pattern. The pointer `ssdie` is being used to contain the IE returned by a call to `ieee80211_bss_get_ie`, the length (`ssdie[1]`) is copied to `join.ssid_len` and later used on a call to `memcpy` with a fully controlled length. [Check the code](https://github.com/torvalds/linux/blob/master/drivers/net/wireless/st/cw1200/sta.c#L1293). ```cpp if (!conf->ibss_joined) { const u8 *ssidie; rcu_read_lock(); ssidie = ieee80211_bss_get_ie(bss, WLAN_EID_SSID); if (ssidie) { join.ssid_len = ssidie[1]; memcpy(join.ssid, &ssidie[2], join.ssid_len); ``` It is important to notice that `join` is a `wsm_join` structure and the size of the `ssid` field is only 32 bytes. ```cpp /* Specifies the SSID of the IBSS to join or start */ u8 ssid[32]; ``` ## Finding 2 (CVE-2019-17133) The exact same pattern can be found in `wext-sme.c`. The length is taken from the IE without any check and used as a parameter to copy into the `ssid` variable on `memcpy`. [Check the code](https://github.com/torvalds/linux/blob/master/net/wireless/wext-sme.c#L217). ```cpp ie = ieee80211_bss_get_ie(&wdev->current_bss->pub, WLAN_EID_SSID); if (ie) { data->flags = 1; data->length = ie[1]; memcpy(ssid, ie + 2, data->length); ``` ## Vulnerability disclosure timeline - **August, 28th 2019** - CVE-2019-14814 found by huangwen was released with information about the vulnerabilities. - **August, 29th 2019** - Github researcher created a QL query and started triaging the vulnerability. - **August, 30th 2019** - Vulnerabilities found and were submitted to the Linux Kernel security mailing list for review. - **September, 20th 2019** - Linux team mailed the public mailing list (LKML) with a patch fixing some of the issues (nl80211/cfg80211). - **October, 4th 2019** - Linux team provided more fixes into LKML (mac80211/wext-sme).