May 12, 2020

Hot lava: A case study in hunting for network integer arithmetic flaws

Bas Alberts

Most traditional code audits generally follow these steps:

  1. Enumerate all attacker-controlled input
  2. Enumerate all program logic influenced by attacker-controlled input
  3. Enumerate all flaws that may result from said attacker influence
  4. Establish which of those flaws may lead to exploitable conditions and warrant security triage

In the practical sense, this involves tracing controlled data through various APIs and evaluating their impact on program state. In a more abstract sense, you can consider all influence on a target process as "attacker-controlled input". Any influence an attacker has on program state may result in vulnerabilities. This can be as straightforward as "my large string goes out of bounds and overwrites process memory" to something as esoteric as "I can flip bits in the system memory that this process is using because physics."

As you spend more time auditing code, you tend to repeatedly run into the same vulnerability patterns. Over time you develop a fairly reliable intuition for the kind of code constructs that tend to lead to certain types of vulnerabilities. Whenever you look at network protocol code in a memory unsafe language such as C, one such common construct is remote controlled integer values that influence how subsequent remote data is processed. Such [length][value] constructs are common to almost all protocols in which data is transferred.

Whenever we're discussing sizes and lengths, we're generally talking about dealing with integer variables at the implementation level. Even experienced programmers tend to have a hard time with integer overflow, promotion, and truncation in C. Integer arithmetic that seems logically sound at first sight often doesn't account for edge cases that may introduce undesired behavior. We've seen these types of integer arithmetic-related issues creep into all sorts of high profile projects over the years, many of which result in exploitable vulnerabilities.

Whenever integer arithmetic occurs with remotely supplied integers, it's a good idea to slow down and perform a careful audit of the resulting logic. This isn't limited to network protocols and you should always think defensively about any protocol in which length values are supplied from a data source that may be controlled or influenced by a malicious party.

In this post we'll examine a practical example of such flawed integer arithmetic in the ntop Deep Packet Inspection toolkit (nDPI). The ntop Deep Packet Inspection Toolkit is driven in large part by the nDPI library. This library contains a large set of network protocol dissectors intended to parse and analyze packet-captured network traffic.

The nDPI SSH protocol dissector suffered from multiple integer overflow vulnerabilities. These vulnerabilities resulted in a controlled remote heap overflow. Due to the granular nature of the overflow primitive and the ability to control both the contents and layout of the nDPI library's heap memory through remote input, this vulnerability could potentially be abused to achieve full Remote Code Execution (RCE). This would affect any network inspection stack that was linked against vulnerable versions of nDPI.

Additionally, the nDPI SSH and Postgres protocol dissectors suffered from Out Of Bounds (OOB) read vulnerabilities, which could have resulted in a Denial of Service (DoS).

These issues were reported to the project maintainers as GHSL-2020-051 and GHSL-2020-052 and resolved.

Multiple integer overflows in ssh.c:concat_hash_string (GHSL-2020-051)

Note that this annotated source code is based on this version of nDPI's ssh.c.

When dissecting the SSH protocol the nDPI library actively parses KEXINIT (type: 20) messages in both the server-to-client and client-to-server directions. It also attempts to extract various descriptive string sets from SSH KEXINIT packets. These string sets include information such as the key exchange algorithms supported by the client and the server.

The SSH protocol handles string data based on the common [length][data] format, where length is a 32bit integer value. As such, nDPI extracts these length values, and uses them to pull the actual string data from the SSH KEXINIT packets. nDPI also uses these length values to calculate and maintain a running offset in the captured packet data. This offset is maintained as a 16bit unsigned integer variable.

For example, to pull the key exchange algorithms out of a SSH KEXINIT packet, nDPI performs the following operations:

ssh.c:ndpi_search_ssh_tcp
...
    if(msgcode == 20 /* key exchange init */) {
      char *hassh_buf = calloc(packet->payload_packet_len, sizeof(char));
...
          len = concat_hash_string(packet, hassh_buf, 0 /* server */);
...

ssh.c:concat_hash_string
...
  u_int16_t offset = 22, buf_out_len = 0;

  if(offset+sizeof(u_int32_t) >= packet->payload_packet_len)
    goto invalid_payload;

  u_int32_t len = ntohl(*(u_int32_t*)&packet->payload[offset]);
  offset += 4;

  /* -1 for ';' */
  if((offset >= packet->payload_packet_len) || (len >= packet->payload_packet_len-offset-1))
    goto invalid_payload;

  /* ssh.kex_algorithms [C/S] */
  strncpy(buf, (const char *)&packet->payload[offset], buf_out_len = len);
  buf[buf_out_len++] = ';';
  offset += len;

We initially make two observations:

  1. The destination buffer buf was previously allocated with a calloc call that is sized according to packet->payload_packet_len, which represents the actual size of the captured SSH packet.

  2. nDPI attempts to make sure that the offset and subsequently the len variables do not result in data accesses beyond packet->payload_packet_len. The intent here is to prevent read or write access outside of the bounds of the allocated buf memory region.

However, when examining the concat_hash_string function in greater detail, we note the following pattern:

ssh.c:concat_hash_string
...
[1]
  /* ssh.encryption_algorithms_client_to_server [C] */
  len = ntohl(*(u_int32_t*)&packet->payload[offset]);

  if(client_hash) {
    offset += 4;

    if((offset >= packet->payload_packet_len) || (len >= packet->payload_packet_len-offset-1))
      goto invalid_payload;

    strncpy(&buf[buf_out_len], (const char *)&packet->payload[offset], len);
    buf_out_len += len;
    buf[buf_out_len++] = ';';
    offset += len;
  } else
[2]
    offset += 4 + len;

  /* ssh.encryption_algorithms_server_to_client [S] */
  len = ntohl(*(u_int32_t*)&packet->payload[offset]);

  if(!client_hash) {
    offset += 4;

    if((offset >= packet->payload_packet_len) || (len >= packet->payload_packet_len-offset-1))
      goto invalid_payload;

    strncpy(&buf[buf_out_len], (const char *)&packet->payload[offset], len);
    buf_out_len += len;
    buf[buf_out_len++] = ';';
    offset += len;
  } else
    offset += 4 + len;

[3]
  /* ssh.mac_algorithms_client_to_server [C] */
  len = ntohl(*(u_int32_t*)&packet->payload[offset]);

  if(client_hash) {
    offset += 4;

[4]
    if((offset >= packet->payload_packet_len) || (len >= packet->payload_packet_len-offset-1))
      goto invalid_payload;
[5]
    strncpy(&buf[buf_out_len], (const char *)&packet->payload[offset], len);
    buf_out_len += len;
    buf[buf_out_len++] = ';';
    offset += len;
  } else
[6]
    offset += 4 + len;

The client_hash variable decides whether the packet direction is server-to-client or client-to-server, but since the parsing pattern is the same for either direction it doesn't matter for the sake of our discussion. For convenience, we'll examine the !client_hash case, but the same logic is true for the inverse.

At [1] we see that we have full control over a 32bit unsigned length integer. At [2] we see that offset is updated according to this controlled length variable based on an integer addition operation (offset + 4 + len). This arithmetic expression expands to 32bit integer values and it's then truncated back to a 16bit integer value on the final assignment back into offset. This means that, given we have full control of the len variable, we can effectively integer wrap offset to be any value we want.

For example, to make offset become n where n is any desired 16bit integer value, we would simply set len to 0 - offset - 4 + n.

The practical implication here is that, given at [4] the offset variable is used to ensure no OOB accesses occur, and given our ability to effectively set (and RESET) offset to any desired 16bit integer value, we can pass the intended bounds checks by resetting offset to a small enough value.

This integer overflow is the core issue resulting in what ultimately becomes a controlled remote heap overflow. However, to mold this scenario into a RCE-viable situation, some additional context is required.

We've established that we can effectively reset offset in-between the strncpy operations that copy remote controlled data into buf. When we examine the copy operation at [5], we note that destination offset into buf is controlled by buf_out_len. We also note that buf_out_len is adjusted upward based on the len variable, which is the length of our remote controlled string.

For practical exploitation of this issue, the fact that buf_out_len is maintained as its own independent offset into the destination buffer becomes relevant.

Let's recap:

  • We control offset through an integer overflow
  • We can repeat a pattern of: strncpy(&buf[buf_out_len], controlled_data, controlled_len), adjust buf_out_len up by controlled_len, reset offset to any desired n
  • We can not directly overflow buf based on controlled_len due to the controlled_len >= packet->payload_packet_len-offset-1 check

Because packet->payload_packet_len controls the buf memory allocation size, and is directly influenced by our packet size, the intuition that we can simply pack repeated large strings to trigger an offset overwrap here is less than ideal from an attacker perspective. However, what we CAN do is pack a single string into the KEXINIT SSH packet and then copy that string repeatedly. A single string will result in a calloc based on the string's size + some minor SSH protocol overhead. Since buf_out_len is not bounds checked at any point, we can reset the offset variable to point at that initial string data for each subsequent strncpy operation.

In other words, using the offset integer overflow primitive, we can trick the nDPI SSH dissector into repeatedly copying the same data into &buf[buf_out_len]. Since buf_out_len is adjusted based on our controlled string length, and not otherwise sanity checked, we can write out of bounds as early as the second repeat of the strncpy operation, pending string size vs protocol overhead (which you fully control).

This then becomes a controlled remote heap overflow primitive. Since the nDPI library has a direct relationship to remotely controlled input, and it will allocate, deallocate, and populate heap memory in a direct 1:1 relationship to said remotely controlled input, this becomes a viable primitive to achieve remote code execution.

OOB read in ssh.c:concat_hash_string (GHSL-2020-052)

A comparatively minor, but related, repeating issue exists in the form of an OOB read. The following snippet is an example:

ssh.c:concat_hash_string
...
  /* ssh.server_host_key_algorithms [None] */
  len = ntohl(*(u_int32_t*)&packet->payload[offset]);
[1]
  offset += 4 + len;

  /* ssh.encryption_algorithms_client_to_server [C] */
[2]
  len = ntohl(*(u_int32_t*)&packet->payload[offset]);
...

We note that at [1], offset is calculated per the previously discussed integer arithmetic, which is susceptible to integer overflow. However, beyond that, we also note that at [2] the resulting offset value is immediately used as an index into packet data without any bounds check. This may result in an OOB read due to offset being fully user controlled.

A very similar pattern is repeated any time a new offset is calculated in a path that doesn't perform an explicit check against the resulting offset value, for example:

  /* ssh.encryption_algorithms_client_to_server [C] */
  len = ntohl(*(u_int32_t*)&packet->payload[offset]);

  if(client_hash) {
    offset += 4;

    if((offset >= packet->payload_packet_len) || (len >= packet->payload_packet_len-offset-1))
      goto invalid_payload;

    strncpy(&buf[buf_out_len], (const char *)&packet->payload[offset], len);
    buf_out_len += len;
    buf[buf_out_len++] = ';';
    offset += len;
  } else
[1]
    offset += 4 + len;

  /* ssh.encryption_algorithms_server_to_client [S] */
[2]
  len = ntohl(*(u_int32_t*)&packet->payload[offset]);

Again, at [1] the initial remote controlled len variable is used to set offset, immediately after at [2] the resulting offset is used as an index into packet data without any bounds check. This may result in an OOB read due to offset being fully user controlled.

Verification

We can verify these issues by crafting a malicious KEXINIT SSH message and sending it along a network path that is monitored by the nDPI library. In this case we altered the Paramiko SSH library to send a modified KEXINIT message that triggers the aforementioned integer overwrap and subsequent memory corruption. As apparent in the following debugging session, we're able to trigger these vulnerabilities based on malformed remote input.

# nDPI/example › gdb ./ndpiReader
...
(gdb) break concat_hash_string
Breakpoint 1 at 0x431c00: file protocols/ssh.c, line 99.
(gdb) r -i wlp1s0
Starting program: /home/bas/repos/nDPI/example/ndpiReader -i wlp1s0
...
Thread 2 "ndpiReader" hit Breakpoint 1, concat_hash_string (
    packet=0x7ffff0046130, buf=buf@entry=0x7ffff0047340 "",
    client_hash=client_hash@entry=0 '\000') at protocols/ssh.c:99
99    if(offset+sizeof(u_int32_t) >= packet->payload_packet_len)
Missing separate debuginfos, use: dnf debuginfo-install libgcc-9.2.1-1.fc31.x86_64 libpcap-1.9.1-2.fc31.x86_64 libstdc++-9.2.1-1.fc31.x86_64
(gdb) p buf
$1 = 0x7ffff0047340 ""
(gdb) p packet->payload_packet_len
$2 = 1064
(gdb) break strncpy
Breakpoint 2 at 0x7ffff7b69110
(gdb) c
Continuing.

Thread 2 "ndpiReader" hit Breakpoint 2, 0x00007ffff7b69110 in __strncpy_avx2
    () from /lib64/libc.so.6
(gdb) finish
Run till exit from #0  0x00007ffff7b69110 in __strncpy_avx2 ()
   from /lib64/libc.so.6
concat_hash_string (packet=0x7ffff0046130,
    buf=buf@entry=0x7ffff0047340 'A' <repeats 200 times>...,
    client_hash=client_hash@entry=0 '\000') at protocols/ssh.c:110
110   buf[buf_out_len++] = ';';
(gdb) c
Continuing.

Thread 2 "ndpiReader" hit Breakpoint 2, 0x00007ffff7b69110 in __strncpy_avx2
    () from /lib64/libc.so.6
(gdb) finish
Run till exit from #0  0x00007ffff7b69110 in __strncpy_avx2 ()
   from /lib64/libc.so.6
concat_hash_string (packet=0x7ffff0046130,
    buf=buf@entry=0x7ffff0047340 'A' <repeats 200 times>...,
    client_hash=client_hash@entry=0 '\000') at protocols/ssh.c:144
144     buf[buf_out_len++] = ';';
(gdb) p buf_out_len
$3 = 2050
(gdb) p packet->payload_packet_len
$4 = 1064
(gdb) c
Continuing.
free(): invalid next size (normal)

Thread 2 "ndpiReader" received signal SIGABRT, Aborted.
0x00007ffff7a42625 in raise () from /lib64/libc.so.6
(gdb)

Note that after the second strncpy call, buf_out_len is pointing beyond the bounds of the buf allocated memory region (which is based on packet->payload_packet_len), and the resulting heap memory corruption triggers the failure of a heap integrity check on continuation of the ndpiReader application.

Variant Analysis

We caught the previous cases by performing an overall manual audit focused on remotely supplied integers. But since nDPI contains a large set of network protocol dissectors it makes sense to distill the essence of our audit into a CodeQL query to see if the same pattern repeats in other places that we might've missed in our initial review.

To achieve that goal, we first ask ourselves, what are we looking for? It's important that we clarify, to ourselves and others, the what and why of the things we're looking for. It becomes even more important when translating your audit questions into static analysis queries. The clearer your question, the more effectively you can reduce your false positive rate.

When using CodeQL as an auditing tool, you want your queries to enhance your audit workflow-asking pointed questions that result in a manageable amounts of candidates is key. While you can certainly query all integer arithmetic that occurs, you likely end up with thousands of results, and triaging those down is about as effective as a fully manual audit.

Instead we want to compound the aspects of our findings that, when combined, resulted in a vulnerable scenario. In essence we want to triage the following scenario:

  • An integer was read from network supplied data
  • This integer influenced integer arithmetic
  • This integer influenced memory accesses

We saw that most of the nDPI dissector code references network data through packet->payload. We also saw that it uses the ntohl and ntohs macros to perform network byte ordering operations, as well as nDPI specific integer pull macros such as get_u_int32_t. Considering this, we settle on a CodeQL query that will taint any source that uses one of the previously mentioned macro families to access packet->payload. If any of those sources are used for both integer arithmetic and array access, we consider that source as an attacker-controlled input that warrants manual review.

This hybrid model for using CodeQL, where you don't necessarily expect it to find full vulnerabilities for you, but rather use it to narrow your workload when you've established a certain vulnerable code pattern, is quite effective. We ultimately end up with the following query:

/**
* @name Suspicious packet->payload based integer arithmetic
* @description An arithmetic operation influenced array access is suspicious
* if it uses an integer value that is likely to be network-controlled, and
* may require a closer manual audit.
* @kind problem
* @problem.severity warning
* @id cpp/packet-payload-integer-arithmetic
* @tags audit security
*/

import cpp

import semmle.code.cpp.dataflow.TaintTracking
import semmle.code.cpp.rangeanalysis.SimpleRangeAnalysis

/** A source of an integer value that is likely to come from the network.
 * This is produced by an invocation of a macro of the form `ntoh*` or `get_u_int*_t`,
 * called with `packet->payload` as an argument.
 */

class NetworkMacro extends Macro {
  NetworkMacro() { this.getName().regexpMatch("^ntoh(ll|l|s)") }
}

class NetworkIntegerSource extends Expr {
  NetworkIntegerSource() {
    exists(MacroInvocation mi |
      this = mi.getExpr() and
      mi.getUnexpandedArgument(0).regexpMatch(".*packet->payload.*") |
      // catch all get_u_int*_t(x)
      mi.getMacroName().regexpMatch("^get_u_int(64|32|16|8)_t") and
      // dedup ntoh*(get_u_int*_t(x)) since we'll catch those in the next case
      not mi.getOutermostMacroAccess().getMacro() instanceof NetworkMacro
      or
      // catch all ntoh*(x) ... this will also catch the nested cases
      mi.getMacro() instanceof NetworkMacro
    )
  }
}

class ArithmeticOperation extends Operation {
  ArithmeticOperation() {
    this instanceof UnaryArithmeticOperation or this instanceof BinaryArithmeticOperation
  }
}

class NetworkToArrayAccess extends TaintTracking::Configuration {
  NetworkToArrayAccess() { this = "NetworkToArrayAccess" }

  override predicate isSource(DataFlow::Node source) {
    source.asExpr() instanceof NetworkIntegerSource
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(ArrayExpr ae | sink.asExpr() = ae.getArrayOffset())
  }
}

class NetworkToArithmetic extends TaintTracking::Configuration {
  NetworkToArithmetic() { this = "NetworkToArithmetic" }

  override predicate isSource(DataFlow::Node source) {
       source.asExpr() instanceof NetworkIntegerSource
  }

  override predicate isSink(DataFlow::Node sink) {
    exists (Assignment assign |
        sink.asExpr() = assign.getRValue().(ArithmeticOperation) or
        sink.asExpr() = assign.(AssignArithmeticOperation)
    ) or
    exists(LocalVariable var |
      sink.asExpr() = var.getInitializer().getExpr().(ArithmeticOperation)
    )
  }
}

// find audit candidates based on suspicious network integer use
from NetworkIntegerSource source, Expr sink1, Expr sink2, NetworkToArithmetic config1, NetworkToArrayAccess config2
where config1.hasFlow(DataFlow::exprNode(source), DataFlow::exprNode(sink1))
      // or this if you want integer arithmeric _OR_ array accesses
      and config2.hasFlow(DataFlow::exprNode(source), DataFlow::exprNode(sink2))
select source, "Suspicious use of network integer arithmetic."

When we run this query against nDPI we find the following result set:

Suspicious use of network integer arithmetic.   collectd.c:41:32
Suspicious use of network integer arithmetic.   collectd.c:41:26
Suspicious use of network integer arithmetic.   postgres.c:99:22
Suspicious use of network integer arithmetic.   postgres.c:107:12
Suspicious use of network integer arithmetic.   ssh.c:101:19
Suspicious use of network integer arithmetic.   ssh.c:114:9
Suspicious use of network integer arithmetic.   ssh.c:118:9
Suspicious use of network integer arithmetic.   ssh.c:134:9
Suspicious use of network integer arithmetic.   ssh.c:150:9
Suspicious use of network integer arithmetic.   ssh.c:166:9
Suspicious use of network integer arithmetic.   ssh.c:182:9
Suspicious use of network integer arithmetic.   tls.c:796:24
Suspicious use of network integer arithmetic.   tls.c:796:18
Suspicious use of network integer arithmetic.   tls.c:871:21
Suspicious use of network integer arithmetic.   tls.c:871:15
Suspicious use of network integer arithmetic.   tls.c:948:30
Suspicious use of network integer arithmetic.   tls.c:948:24

We see that it found the entry to the originally reported issues in the ssh dissector, as well as additional situations that match the same conditions. The result set is small enough that we can effectively perform a manual triage, which is exactly what we were hoping for.

After examining the additional findings in greater detail, and dismissing the constructs that appear memory safe, we ultimately hone in on the postgres dissector and note the following code constructs:

postgres.c:ndpi_search_postgres_tcp:
...
if (flow->l4.tcp.postgres_stage == 5 && packet->payload[0] == 'R') {
    if (ntohl(get_u_int32_t(packet->payload, 1)) == packet->payload_packet_len - 1) {
        NDPI_LOG_INFO(ndpi_struct, "found postgres asymmetrically\n");
        ndpi_int_postgres_add_connection(ndpi_struct, flow);
        return;
    }
[1]
    size = (u_int16_t)ntohl(get_u_int32_t(packet->payload, 1)) + 1;
    if (packet->payload[size - 1] == 'S') {
        if ((size + get_u_int32_t(packet->payload, (size + 1))) == packet->payload_packet_len) {
            NDPI_LOG_INFO(ndpi_struct, "found postgres asymmetrically\n");
            ndpi_int_postgres_add_connection(ndpi_struct, flow);
            return;
        }
    }
[2]
    size += get_u_int32_t(packet->payload, (size + 1)) + 1;
    if (packet->payload[size - 1] == 'S') {
        NDPI_LOG_INFO(ndpi_struct, "found postgres asymmetrically\n");
        ndpi_int_postgres_add_connection(ndpi_struct, flow);
        return;
    }
}
...

At [1], and [2] we're able to see a familiar pattern repeat itself. A network supplied integer is used in both integer arithmetic and the result is used to access memory without any further validation. This means that we've uncovered two additional OOB read situations. These were also reported to the project maintainers and resolved.

Iterating security as a community

To help prevent future issues of the same nature, we prepared a pull request into ntop/nDPI that places our audit alert query into .lgtm/cpp-queries/packet-payload-integer-arithmetic.ql. This makes the query available as a custom expansion to the lgtm query pack on lgtm enabled GitHub repositories. Whenever future code gets committed that repeats this exact kind of pattern, lgtm will flag an alert to the maintainers.

Patching the patch

After the initial triage of these issues, and the release of the GHSL advisory, Security Researcher Ronald Huizer contacted the GHSL team to let us know that he spotted an additional integer overflow issue in the original nDPI patch.

INTeresting! Let's take a closer look at some of the defensive changes introduced by the ntop team:

ssh.c:concat_hash_string
...
u_int32_t offset = 22, buf_out_len = 0;
...
  if (len > UINT32_MAX - 4 - offset)
    goto invalid_payload;
  offset += 4 + len;

  if(offset+sizeof(u_int32_t) >= packet->payload_packet_len)
    goto invalid_payload;
...
  if (len > UINT32_MAX - offset)
    goto invalid_payload;
  offset += len;
...

We note that there was a clear effort to ensure that the offset += 4 + len can not wrap beyond UINT32_MAX. So far so good.

But then we arrive at:

  if(offset+sizeof(u_int32_t) >= packet->payload_packet_len)
    goto invalid_payload;

And we have to ask ourselves, can offset be made to wrap due to the addition of sizeof(u_int32_t)? The answer is ... it depends. Consider that sizeof() returns type size_t. On 64bit platforms size_t is generally 64bits, so on these platforms this expression converts to use 64bit integers and the check would function as intended. No integer wrap occurs within the expression thanks to sizeof() returning a 64bit integer.

But on platforms where sizeof() returns a 32bit size_t the expression will stick with 32bit integers. Even though the initial check ensures that we don't go BEYOND UINT32_MAX, we can certainly get it to beUINT32_MAX minus (4...1) and within wrapping range when the sizeof(u_int32_t) (4) is added.

What's the impact of that? Well, essentially it reverts us to almost the same exploitation semantics as the first issue. The ability to RESET offset to a smaller than intended value for the second strncpy operation allows us to run buf_out_len near the end of allocated memory, and subsequently trigger a heap overflow.

Let's walk through the patched code to verify:

static u_int16_t concat_hash_string(struct ndpi_packet_struct *packet,
                   char *buf, u_int8_t client_hash) {
  u_int32_t offset = 22, buf_out_len = 0;
  if(offset+sizeof(u_int32_t) >= packet->payload_packet_len)
    goto invalid_payload;
  u_int32_t len = ntohl(*(u_int32_t*)&packet->payload[offset]);
  offset += 4;

  /* -1 for ';' */
  if((offset >= packet->payload_packet_len) || (len >= packet->payload_packet_len-offset-1))
    goto invalid_payload;

  /* ssh.kex_algorithms [C/S] */
[1]
  strncpy(buf, (const char *)&packet->payload[offset], buf_out_len = len);
  buf[buf_out_len++] = ';';
  offset += len;

  if(offset+sizeof(u_int32_t) >= packet->payload_packet_len)
    goto invalid_payload;
  /* ssh.server_host_key_algorithms [None] */
[2]
  len = ntohl(*(u_int32_t*)&packet->payload[offset]);
  if (len > UINT32_MAX - 4 - offset)
    goto invalid_payload;
  offset += 4 + len;
[3]
  if(offset+sizeof(u_int32_t) >= packet->payload_packet_len)
    goto invalid_payload;
  /* ssh.encryption_algorithms_client_to_server [C] */
[4]
  len = ntohl(*(u_int32_t*)&packet->payload[offset]);
[5]
  offset += 4;
  if(client_hash) {
    if((offset >= packet->payload_packet_len) || (len >= packet->payload_packet_len-offset-1))
      goto invalid_payload;
[6]
    strncpy(&buf[buf_out_len], (const char *)&packet->payload[offset], len);
    buf_out_len += len;
    buf[buf_out_len++] = ';';
  }
  if (len > UINT32_MAX - offset)
    goto invalid_payload;
  offset += len;

REMEMBER: Think of 32bit pointers and integers for everything in this path.

The issue becomes sort of similar to the original logic, just slightly trickier. Imagine we again send a packet with a single string.

At [1] the first string is copied into buf, and buf_out_len is adjusted upwards according the length of the string + 1 (accounting for the ;). At [2], we fully control len and we can make offset become any value up to UINT32_MAX. This means that at [3] we can trigger integer overflow and bypass the check against packet->payload_packet_len, then at [4], we receive a NEW len. At [5] we then trigger the offset integer overflow again and can RESET offset to a smaller value than intended.

Now imagine we had made the algebra at [2] result in an offset value of -1 (0xffffffff). On a 32bit system this would result in len being loaded from payload[-1], which on a little endian system means we control the top three most significant bytes of len, at [5] offset would then become 3.

Now, to recap, buf_out_len is already pointing almost at the end of our allocated space since we sent a single string in a small packet, but we RESET offset to a small value (3), so if we can make our partially controlled len < packet->payload_packet_len-3-1 we pass the intended bounds checks.

We then enter the strncpy at [6], which will trigger a heap overflow, since buf_out_len is already pointing near the end of buf. This potentially recreates our original exploitation scenario, but with less repeatability and control. Since we can't reset offset to index to our original string value here but rather index into the SSH message protocol header, we have to get a bit lucky and ensure that the partially controlled packet data at payload[-1] can pass the required protocol checks to reach this code as well as the length check.

But, if those stars happen to align, this is essentially the same scenario as the original issue, where the ability to RESET offset, combined with the fact that buf_out_len is tracked as an independent index from offset and len, allows for heap overflow as a result of integer overflow.

So what went wrong in the original patch? Well, first of all it's generally better to explicitly state your intent when it comes to integer logic in C code. In this example, 64bit platforms are only ok due to an implicit integer conversion introduced by sizeof()'s return type. This introduces ambiguity of intent.

If we wanted to explicitly check offset for unsigned integer overflow on both 32bit and 64bit platforms we could do something like this:

  if((u_int32_t)(offset+sizeof(u_int32_t)) < offset || (u_int32_t)(offset+sizeof(u_int32_t)) >= packet->payload_packet_len)

Here we are explicitly casting our arithmetic down to the size of the operand we are concerned about overflowing (offset) and ensure the resulting expression uses unsigned integers. We then check if the result of the expression is smaller than our original value, which would imply integer overflow occurred. We can get away with this because unsigned integer overflow is fully defined in the C standard and we can expect it to wrap modulo UINT32_MAX.

However, these kinds of solutions can introduce their own issues, and they can get tricky pretty fast.

For example, signed integer overflow is undefined behavior in the C standard, so you can't rely on it for overflow checks similar to the one we just proposed using unsigned integer arithmetic. In fact, some compilers may merrily optimize away signed overflow checks of this type resulting in completely unexpected logic.

Isn't C fun?

And what about subtraction, division, incrementing, or multiplication? Each requires their own situationally aware checks that are specific to signed integers and unsigned integers. The safest thing to do is to consistently check that your operands are within the value ranges that you expect them to be before using them in any integer arithmetic.

A simpler approach to the patch might then be to never let integer overflow occur in the first place:

  1. Make sure that offset + sizeof(u_int32_t) is not > UINT32_MAX
  2. Make sure that offset + sizeof(u_int32_t) is not >= packet->payload_packet-len

So, let's split this up into two independent checks. One to ensure our operand is within our expected value range, and another that makes sure that our resulting offset is within packet payload bounds.

/* ensure offset is outside of UINT32_MAX wrapping range */
if (offset > UINT32_MAX - sizeof(u_int32_t))
  goto invalid_payload;

/* ensure offset is not beyond packet payload boundaries */
if (offset + sizeof(u_int32_t) >= packet->payload_packet_len)
  goto invalid_payload;

Or as a one-liner:

if (offset > UINT32_MAX - sizeof(u_int32_t) || offset + sizeof(u_int32_t) >= packet->payload_packet_len)
  goto invalid_payload;

So when it comes to handling network sourced integers in C, you generally want to slow down to carefully read and then re-read any proposed code that handles integer arithmetic with full audit focus. Then read it one more time. Even for experienced eyes (cough) it can be easy to skip over an issue at first glance if a patch is not reviewed carefully.

Ultimately the ntop maintainer decided to rework the concat_hash_string logic by removing any ability to get close to UINT32_MAX boundaries in the following commit.

It's a wrap!

As we've demonstrated, a very common network protocol construct can quickly turn into vulnerable code when network supplied integers are treated as anything but hot lava. We also saw that, even with full understanding of the issue, it's still possible to miss the exact same type of vulnerability in the exact same code if you're not considering all platforms that code may run on. With this in mind, it's crucial that all user influenced C integer arithmetic is carefully considered in terms of its integer promotions, integer conversion rank, and the usual arithmetic conversions, along with how they may impact integer overflow and underflow corner cases.