Coordinated Disclosure Timeline

Summary

A crafted markdown document can trigger an out-of-bounds read in cmark-gfm.

Product

cmark-gfm

Tested Version

0.29.0.gfm.6

Details

Issue: out-of-bounds read in validate_protocol (GHSL-2022-118)

The function validate_protocol contains two array accesses, at autolink.c:277 and autolink.c:282, that are not properly bounds-checked:

static bool validate_protocol(char protocol[], uint8_t *data, int rewind) {
  size_t len = strlen(protocol);

  // Check that the protocol matches
  for (int i = 1; i <= len; i++) {
    if (data[-rewind - i] != protocol[len - i]) {    // <==== data[-rewind - i] could be out-of-bounds
      return false;
    }
  }

  char prev_char = data[-rewind - len - 1];    // <==== data[-rewind - len - 1] could be out-of-bounds

  // Make sure the character before the protocol is non-alphanumeric
  return !cmark_isalnum(prev_char);
}
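
For illustration only, here is a minimal sketch of how both reads could be bounds-checked. It is not presented as the upstream fix. The extra parameter max_rewind (the number of valid bytes that precede data in its buffer) and the name validate_protocol_checked are hypothetical, and isalnum() stands in for cmark_isalnum() so that the example compiles on its own.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch of a bounds-checked variant. `max_rewind` is a hypothetical extra
 * parameter: the number of valid bytes that precede `data` in its buffer.
 * isalnum() stands in for cmark_isalnum().
 */
static bool validate_protocol_checked(const char *protocol, const uint8_t *data,
                                      size_t rewind, size_t max_rewind) {
  assert(protocol != NULL && data != NULL);

  size_t len = strlen(protocol);

  /* Refuse to look further back than the start of the buffer. */
  if (rewind > max_rewind || len > max_rewind - rewind) {
    return false;
  }

  /* All `len` bytes are now known to be in bounds, so compare them at once. */
  const uint8_t *start = data - rewind - len;
  if (memcmp(start, protocol, len) != 0) {
    return false;
  }

  /* The protocol begins at the very start of the buffer: no previous byte. */
  if (len == max_rewind - rewind) {
    return true;
  }

  /* Make sure the character before the protocol is non-alphanumeric. */
  return !isalnum((unsigned char)start[-1]);
}
```

Using size_t for rewind and max_rewind keeps the length arithmetic unsigned, and the two early returns cover the two unchecked accesses flagged above: the first guards the comparison loop, the second guards the read of prev_char.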

The out-of-bounds access can be triggered like this:

echo "to:kev@example.com" | ./src/cmark-gfm -e autolink
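
One plausible reading of why this input works is traced below. The sketch replays the index arithmetic of validate_protocol without performing any out-of-range read. It rests on assumptions that are not stated in this advisory: that data points at the '@' (offset 6 in the input), that rewind is 3 (the length of "kev"), and that the protocol being checked is "mailto:". The helper name first_oob_index is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Replays the index arithmetic of the matching loop without dereferencing
 * anything out of range. `pos` (offset of `data` within `buf`) and `rewind`
 * are assumed values, not taken from the cmark-gfm caller.
 * Returns the first negative buffer index the original code would read,
 * or 0 if every read stays inside the buffer.
 */
static long first_oob_index(const char *buf, long pos, long rewind,
                            const char *protocol) {
  assert(buf != NULL && protocol != NULL);
  assert(rewind >= 0 && pos >= rewind && (size_t)pos <= strlen(buf));

  long len = (long)strlen(protocol);

  for (long i = 1; i <= len; i++) {
    long idx = pos - rewind - i; /* buffer index of data[-rewind - i] */
    if (idx < 0) {
      return idx; /* the original code would read before the buffer here */
    }
    if (buf[idx] != protocol[len - i]) {
      return 0; /* mismatch: the original returns false before going OOB */
    }
  }

  long prev = pos - rewind - len - 1; /* buffer index of prev_char */
  return prev < 0 ? prev : 0;
}
```

Under those assumptions, first_oob_index("to:kev@example.com", 6, 3, "mailto:") returns -1: the bytes "to:" match the tail of "mailto:", so the loop keeps walking backwards and its fourth iteration reads one byte before the start of the buffer. An input that begins with the full "mailto:" prefix passes the loop but reaches index -1 at the prev_char read instead.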

To observe the error, you will need to build cmark-gfm with ASAN enabled, as follows:

mkdir build-asan
cd build-asan
cmake -DCMAKE_C_FLAGS='-fsanitize=address' -DCMAKE_EXE_LINKER_FLAGS='-fsanitize=address' -DCMAKE_C_COMPILER=$(which clang) -DCMAKE_CXX_COMPILER=$(which clang++) -DCMAKE_BUILD_TYPE=Debug ..
make
echo "to:kev@example.com" | ./src/cmark-gfm -e autolink

Impact

We believe this bug is harmless in practice, because the out-of-bounds read accesses malloc metadata without causing any visible damage.

CVE

Credit

This issue was discovered and reported by GHSL team member @kevinbackhouse (Kevin Backhouse).

Contact

You can contact the GHSL team at securitylab@github.com. Please include a reference to GHSL-2022-118 in any communication regarding this issue.