U-Boot Challenge

Do you want to challenge your vulnerability hunting skills and quickly learn CodeQL? Your mission, should you choose to accept it, is to find all variants of an attacker-controlled memcpy overflow. You will do this using QL, our simple yet expressive code query language. To capture the flag, you’ll need to write a query that finds unsafe calls to memcpy by following this step-by-step guide.

Challenge instructions

The goal of this challenge is to find the 13 remote-code-execution vulnerabilities that our security researchers found in the U-Boot loader. The vulnerabilities can be triggered when U-Boot is configured to use the network for fetching the next stage boot resources. MITRE has issued the following CVEs for the 13 vulnerabilities: CVE-2019-14192, CVE-2019-14193, CVE-2019-14194, CVE-2019-14195, CVE-2019-14196, CVE-2019-14197, CVE-2019-14198, CVE-2019-14199, CVE-2019-14200, CVE-2019-14201, CVE-2019-14202, CVE-2019-14203, and CVE-2019-14204.

Through these vulnerabilities, an attacker on the same network (or one controlling a malicious NFS server) could gain code execution on the U-Boot-powered device. The first two occurrences of the vulnerability were plain memcpy overflows in which an attacker-controlled size was taken from the network packet without any validation. The memcpy function copies n bytes from memory area src to memory area dest. This is unsafe when the size parsed from the packet is not validated against the destination buffer, because the attacker then fully controls both the data and the length being copied.
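
To make the pattern concrete, here is a minimal C sketch. It is not U-Boot's actual code: the packet layout (a 4-byte big-endian length followed by the payload), the function names copy_payload_unchecked and copy_payload_checked, and the 64-byte buffer size are all hypothetical. It assumes only the POSIX ntohl from <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* ntohl (POSIX) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEST_SIZE 64  /* hypothetical destination buffer size */

/* Hypothetical packet layout: 4-byte big-endian length, then the payload. */

/* Unsafe: the length comes straight from the packet and is never validated,
 * so a large value makes memcpy write past the end of dest. */
void copy_payload_unchecked(uint8_t dest[DEST_SIZE], const uint8_t *pkt)
{
    uint32_t len;
    memcpy(&len, pkt, sizeof len);        /* read the length field */
    len = ntohl(len);                     /* attacker-controlled value */
    memcpy(dest, pkt + sizeof len, len);  /* overflow if len > DEST_SIZE */
}

/* Safer: reject lengths that exceed the destination buffer or the number of
 * bytes actually received. Returns 0 on success, -1 on a bad length. */
int copy_payload_checked(uint8_t dest[DEST_SIZE], const uint8_t *pkt,
                         size_t pkt_len)
{
    uint32_t len;
    if (pkt_len < sizeof len)
        return -1;
    memcpy(&len, pkt, sizeof len);
    len = ntohl(len);
    if (len > DEST_SIZE || len > pkt_len - sizeof len)
        return -1;
    memcpy(dest, pkt + sizeof len, len);
    return 0;
}
```

The only difference between the two functions is the pair of bounds checks before the final memcpy. Calls like the first one are what the query in this challenge is meant to surface.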

U-Boot contains hundreds of calls to memcpy and to byte-order conversion functions such as ntohl and ntohs, which are typically applied to data read from the network. In this challenge, you will use CodeQL to find those calls. Of course, many of those calls are safe, so throughout this challenge you will refine your query to reduce the number of false positives.

Upon completion of the challenge, you will have a query that is able to find many of the vulnerabilities that allow for remote execution of arbitrary code on U-Boot powered devices.

Setup instructions

Instructions for installing CodeQL are included at the end of this document.

Documentation links

If you get stuck, try searching our documentation and blog posts for help and ideas. Below are a few links to help you get started:

https://codeql.github.com/docs/

Challenge

The challenge is split into several steps, each of which contains multiple questions.

Step 0: Finding the definitions of memcpy, ntohl, ntohll, and ntohs

  import cpp

  from Function f
  where f.getName() = "strlen"
  select f

Question 0.0: Can you work out what the above query is doing?

Hint: Paste it in the Query Console and run it.

Question 0.1: Modify the query to find the definition of memcpy.

Hint: Queries have a from, where, and select clause. Have a look at this introduction to the QL language.

Question 0.2: ntohl, ntohll, and ntohs can either be functions or macros (depending on the platform where the code is compiled).

As these snapshots for U-Boot were built on Linux, we know they are going to be macros. Write a query to find the definition of these macros.

Hint: The CodeQL Query Console has an auto-completion feature. Hit Ctrl-Space after the from clause to get the list of objects you can query. Wait a second after typing myObject. to get the list of methods.
Hint: We can use a regular expression to write a query that searches for all three macros at once.
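
As background for Question 0.2, the C sketch below shows a byte-order conversion written as a macro rather than a function. MY_SWAB32 and my_swapped_example are hypothetical names, not U-Boot's definitions; the macro performs the 32-bit byte swap that a network-to-host conversion amounts to on a little-endian host.

```c
#include <stdint.h>

/* Hypothetical stand-in for a platform's ntohl macro: swap the four bytes
 * of a 32-bit value. */
#define MY_SWAB32(x) \
    ((uint32_t)((((uint32_t)(x) & 0x000000ffu) << 24) | \
                (((uint32_t)(x) & 0x0000ff00u) <<  8) | \
                (((uint32_t)(x) & 0x00ff0000u) >>  8) | \
                (((uint32_t)(x) & 0xff000000u) >> 24)))

/* The preprocessor replaces the macro use below with the expression above,
 * so the compiled code contains no call to a function named MY_SWAB32. */
static uint32_t my_swapped_example(uint32_t wire_value)
{
    return MY_SWAB32(wire_value);  /* a macro invocation, not a function call */
}
```

Because the conversion is expanded in place, there is no function definition to look up, which is why this question asks for macro definitions.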

Step 1: Finding the calls to memcpy, ntohl, ntohll, and ntohs

Question 1.0: Find all the calls to memcpy.

Hint: Use the auto-completion feature on the function call variable to guess how to express the relation between a function call and a function, and how to bind them.

Question 1.1: Find all the calls to ntohl, ntohll, and ntohs.

Hint: Calls to ntohl, ntohll, and ntohs are macro invocations, unlike memcpy, which is a function call.

Question 1.2: Find the expressions that resulted in these macro invocations.

Hint: We need to get the expression of each macro invocation we found in 1.1.

Step 2: Data flow analysis

For this step, we want to detect cases where data read from the network ends up being used by a call to memcpy. To do this, we’ll use the CodeQL taint tracking library and its predicate flowPath, which tells us when data coming from a source flows to a sink. Use the boilerplate provided below to complete your taint tracking query.
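
As an illustration of the kind of flow this step targets, here is a C sketch in which the value returned by ntohl reaches the size argument of memcpy only indirectly, through a struct field and a helper function. All names (reply_info, parse_reply, store_reply, handle_reply) are hypothetical and not taken from U-Boot, and the missing bounds check is deliberate.

```c
#include <arpa/inet.h>  /* ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Hypothetical parsed-header struct. */
struct reply_info {
    uint32_t data_len;
    const uint8_t *data;
};

/* Source: the result of ntohl is stored in a struct field. */
static void parse_reply(struct reply_info *info, const uint8_t *pkt)
{
    uint32_t raw;
    memcpy(&raw, pkt, sizeof raw);
    info->data_len = ntohl(raw);   /* untrusted value enters here */
    info->data = pkt + sizeof raw;
}

/* Sink: the same value is used as the memcpy size in another function.
 * There is deliberately no bounds check against the size of dest. */
static void store_reply(uint8_t *dest, const struct reply_info *info)
{
    memcpy(dest, info->data, info->data_len);
}

/* Entry point connecting the source to the sink. */
void handle_reply(uint8_t *dest, const uint8_t *pkt)
{
    struct reply_info info;
    parse_reply(&info, pkt);
    store_reply(dest, &info);
}
```

Looking at store_reply alone does not reveal where data_len came from. Connecting it to the ntohl result in parse_reply requires following the value across function boundaries, which is the job of the taint tracking query in this step.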

Question 2.0: Write a QL class that finds all the top-level expressions associated with the macro invocations of ntohl, ntohll, and ntohs.

Hint: Querying this class should give you the same results as in question 1.2

Question 2.1: Create the configuration class, by defining the source and sink. The source should be calls to ntohl, ntohll, or ntohs. The sink should be the size argument of an unsafe call to memcpy.

Hint: The source should be an instance of the class you wrote in question 2.0.
Hint: The sink should be the size argument of calls to memcpy.

  /**
    * @kind path-problem
    * @id cpp/ctf/uboot
    */

  import cpp
  import semmle.code.cpp.dataflow.new.TaintTracking

  class YOUR_CLASS_HERE extends Expr {
    // 2.0 Todo 
  }

  module Config implements DataFlow::ConfigSig {
    predicate isSource(DataFlow::Node source) {
      // 2.1 Todo
    }
    predicate isSink(DataFlow::Node sink) {
      // 2.1 Todo
    }
  }

  module Flow = TaintTracking::Global<Config>;
  import Flow::PathGraph

  from Flow::PathNode source, Flow::PathNode sink
  where Flow::flowPath(source, sink)
  select sink, source, sink, "ntoh flows to memcpy"

Step 3: Find additional vulnerabilities

Question 3.0: There are 13 known vulnerabilities in U-Boot.

The query you completed above probably found 9 of them. See if you can refine your query to find one or more additional vulnerabilities.

Question 3.1: Generalize your query to find other untrusted inputs (not only networking), such as data read from an ext4 filesystem.

Getting Help

If you find yourself stuck writing QL or on any part of the CTF and would like some help, drop us an email at ctf@github.com.

Setup instructions for running CodeQL offline

We hope you enjoyed this challenge! If you are interested in continuing to use CodeQL for security research, we recommend installing CodeQL on your own computer, which lets you run queries offline. We also provide these offline instructions for posterity: query results change over time as the source code evolves, so the instructions below use a snapshot corresponding to revision d0d07ba, the revision for which we designed this challenge. To run CodeQL queries offline, follow these steps:

1. Install the Visual Studio Code IDE.
2. Download and install the CodeQL extension for Visual Studio Code.
3. Download a prebuilt CodeQL database of the vulnerable U-Boot revision d0d07ba, or create one from that revision using the CodeQL CLI, and import it into Visual Studio Code.