I started a four-part series about Ubuntu’s crash reporting system. In this second post, I’ll focus on apport CVE-2019-7307, a TOCTOU vulnerability that enables a local attacker to include the contents of any file on the system in a crash report.
The bug
Apport allows you to place a file in your home directory named ~/.apport-ignore.xml. It enables you to specify a custom list of executables that should be ignored by the crash reporter. But what happens if you replace ~/.apport-ignore.xml with a symlink to a file that you don’t own, such as /etc/shadow?
The code that handles that is at report.py, line 962:
if not os.access(ifpath, os.R_OK) or os.path.getsize(ifpath) == 0:
    # create a document from scratch
    dom = xml.dom.getDOMImplementation().createDocument(None, 'apport', None)
else:
    try:
        dom = xml.dom.minidom.parse(ifpath)
    except ExpatError as e:
        raise ValueError('%s has invalid format: %s' % (_ignore_file, str(e)))
As you can see, it uses os.access to check that the user has permission to access the file. If the permission check passes, then it calls xml.dom.minidom.parse to parse the XML. This is a classic example of a “time of check to time of use” (TOCTOU) vulnerability. If the file is valid at the time of the os.access check, but I quickly replace it with a symlink to a different file before the call to xml.dom.minidom.parse, then I can trick apport into using its elevated privileges to read a file which I do not have permission to access myself.
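The check-then-use pattern can be reproduced in a few lines. This is a minimal sketch, not apport's code: the helper names (check_then_read, between_check_and_use, demo) are made up for illustration, no privileges are involved, and the swap is injected as a callback so that the race is deterministic. In the real attack the swap is done by a separate process.

```python
import os
import tempfile


def check_then_read(path, between_check_and_use=lambda: None):
    """Mimic the vulnerable pattern: os.access() first, open() afterwards."""
    if not os.access(path, os.R_OK):
        return None  # the check rejected the file
    # Race window: anything that runs here can change what `path` points to.
    between_check_and_use()
    with open(path) as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as d:
        ignore = os.path.join(d, "ignore.xml")  # stands in for ~/.apport-ignore.xml
        other = os.path.join(d, "other.txt")    # stands in for the file the attacker wants
        with open(ignore, "w") as f:
            f.write("<apport/>")
        with open(other, "w") as f:
            f.write("contents of another file")

        def swap():
            # Replace the checked file with a symlink after the check has passed.
            os.unlink(ignore)
            os.symlink(other, ignore)

        return check_then_read(ignore, swap)
```

Calling demo() returns the contents of other.txt, even though the file that passed the check was ignore.xml.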
Subtleties of privilege dropping in apport
You may wonder why the os.access check would ever fail, because apport is a root process. The reason is that apport drops privileges during its execution in two stages. The first stage happens at apport, line 455:
# Partially drop privs to gain proper os.access() checks
drop_privileges(True)
The second stage happens at line 601:
# Totally drop privs before writing out the reportfile.
drop_privileges()
What do they mean by “partially drop privs” and “totally drop privs”? This is related to the real, effective, and saved user ids of the process:
State | RUID | EUID | SUID |
---|---|---|---|
root process | 0 | 0 | 0 |
“partially drop privs” | 1001 | 0 | 0 |
read files safely | 1001 | 1001 | 0 |
“totally drop privs” | 1001 | 1001 | 1001 |
(0 is the user id of root and 1001 is the user id of my own unprivileged account, kev.)
The real user id (RUID) determines the owner of the process, but the effective user id (EUID) determines which files the process can read and write. This means that when apport is in the “partially drop privs” state, it can still read any file on the system. The correct way for apport to make sure that it doesn’t accidentally use its root privileges to read or write a file is to first enter the state that I have named “read files safely” in the table. Because the saved user id (SUID) is still root, the process can temporarily enter the “read files safely” state and then revert back to “partially drop privs” after it’s done reading the file. Note that the transition to “totally drop privs” is, in contrast, irreversible.
The os.access check is unusual because it uses the RUID, rather than the EUID, to check whether the real user has permission to access the file. This is the reason why there is a TOCTOU vulnerability. Apport is in the “partially drop privs” state when os.access is called. This means it will reject files that I don’t have permission to read, but if I can bypass the os.access check then the subsequent call to xml.dom.minidom.parse will be able to read any file because the EUID is still root. I can do this by timing the attack to replace ~/.apport-ignore.xml with a symlink just after the call to os.access.
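The “read files safely” state from the table can be sketched with os.seteuid, which changes only the EUID and leaves the saved user id alone. This is an illustration of the idea, not the fix that apport shipped: effective_uid and read_as_real_user are hypothetical helpers, and a real implementation would also need to switch the group ids.

```python
import contextlib
import os


@contextlib.contextmanager
def effective_uid(uid):
    """Temporarily switch the EUID, restoring the previous one on exit."""
    previous = os.geteuid()
    os.seteuid(uid)  # allowed when uid is the real, effective or saved uid
    try:
        yield
    finally:
        os.seteuid(previous)  # possible because the saved uid is unchanged


def read_as_real_user(path):
    """Read `path` with EUID == RUID, so the kernel enforces the real user's
    permissions at open() time and no separate os.access() check is needed.
    Assumption: group ids are left untouched to keep the sketch short."""
    ruid, _euid, _suid = os.getresuid()
    with effective_uid(ruid):
        with open(path, "rb") as f:
            return f.read()
```

Because the permission check and the open happen in the same system call, there is no window in which a symlink swap can change the outcome.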
Comparison to CVE-2019-11481
I found a very similar bug at fileutils.py, line 335:
def get_config(section, setting, default=None, path=None, bool=False):
    '''Return a setting from user configuration.
    This is read from ~/.config/apport/settings or path. If bool is True, the
    value is interpreted as a boolean.
    '''
    if not get_config.config:
        get_config.config = ConfigParser()
        if path:
            get_config.config.read(path)
        else:
            get_config.config.read(os.path.expanduser(_config_file))
This code opens the file ~/.config/apport/settings with a root EUID. At first glance, since an os.access check doesn’t exist here, it seems easier to exploit than the other bug. After further review, I found that it isn’t, because the two code paths handle errors differently.
For example, if I want to use the bug to read the contents of /etc/shadow, it’s not a valid XML file, and it also isn’t formatted correctly to be parsed as an apport settings file. So, in either case, it will trigger a parse error in apport. In the case of ~/.config/apport/settings, this causes apport to abort immediately. But in the case of ~/.apport-ignore.xml, the incorrectly formatted file is ignored and apport continues running. Because of this, I found it easier to exploit ~/.apport-ignore.xml.
I reported the ~/.config/apport/settings bug to Ubuntu: bug 1830862. It’s since been fixed and assigned CVE-2019-11481.
Exploit plan
The bug enables me to trick apport into loading any file on the system, by replacing ~/.apport-ignore.xml with a symlink. But any file that I’m interested in is almost certainly not going to be a valid XML file, so it will cause a parse error and apport will ignore it. How could this help me access forbidden information?
Here’s my cunning plan:
The main idea is that, even though the forbidden file will trigger a parse error and get ignored, it’s still loaded into apport’s heap. This means that if I crash apport then the contents of the file will be included in the crash report. This is the sequence of events in the plan:
- I start /bin/sleep and crash it by sending it a SIGSEGV.
- Apport starts up to generate a crash report for /bin/sleep.
- I replace ~/.apport-ignore.xml with a symlink at exactly the right moment, so that apport loads a forbidden file into memory.
- I crash apport by sending it a SIGSEGV.
- A second apport starts up to generate a crash report for the first apport.
- The second apport writes out a crash report for the first, containing a copy of the forbidden file in the core dump.
Obstacles
It wasn’t quite that easy. I ran into several problems. The obvious one is that precise timing of the symlink switcheroo is crucial, so I anticipated that being difficult to get right. But there were also some unexpected problems, which I’ll cover in the following sections.
Anti-recursion mitigations
Apport has a couple of mitigations to prevent it from running on itself. The comment at apport, line 30 explains that this is to avoid “bringing down the system to its knees if there is a series of crashes”.
The first mitigation is a lock file named /var/crash/.lock. When apport starts, it uses lockf to set a lock on this file to prevent another apport from running at the same time. The interesting thing is that lockf file locks are only advisory! In fact, as Victor Gaydov explains in this excellent overview, the lock is actually associated with an [i-node, pid] pair. This means that if I replace /var/crash/.lock with a new file after the first apport has set its lock, then the second apport will see a different i-node, so both apports can hold locks on /var/crash/.lock at the same time!
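The i-node behaviour is easy to observe with Python's fcntl.lockf. The following is a small sketch under stated assumptions: it uses a temporary directory instead of /var/crash, and the names CHILD, try_lock_in_child and demo are invented for this example. Since lockf locks belong to a process, the second lock attempt runs in a child interpreter.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Child process: try to take a non-blocking exclusive lock on the given path.
CHILD = r'''
import fcntl, sys
with open(sys.argv[1], "w") as f:
    try:
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        print("acquired")
    except OSError:
        print("blocked")
'''


def try_lock_in_child(path):
    result = subprocess.run([sys.executable, "-c", CHILD, path],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, ".lock")  # stands in for /var/crash/.lock
        holder = open(path, "w")
        fcntl.lockf(holder, fcntl.LOCK_EX | fcntl.LOCK_NB)  # the "first apport"
        before = try_lock_in_child(path)  # same i-node: lock is contended
        os.unlink(path)                   # replace the file with a new i-node
        open(path, "w").close()
        after = try_lock_in_child(path)   # new i-node: lock is free
        holder.close()
        return before, after
```

The first attempt is blocked and the second one succeeds, even though the original lock is still held the whole time.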
The trick of replacing /var/crash/.lock with a new file relies on me having permission to delete or move the file. Since the /var/crash directory has the sticky bit set (see the first post for more information), this means that I must own the file. Luckily, /var/crash is world-writable, so I can create /var/crash/.lock as long as it doesn’t already exist.
When I first submitted my bug report to Ubuntu on May 29, I thought that this would often make the vulnerability unexploitable. That’s because on my work laptop, /var/crash/.lock almost always exists and is owned by root. I have since discovered that /var/crash/.lock is deleted by a daily cronjob: /etc/cron.daily/apport. The lock file often exists on my work laptop because I deliberately crash applications on a fairly regular basis. But on a typical Ubuntu system, it is unlikely to exist at any given time, due to the daily cronjob. In my bug report, I recommended that /var/crash/.lock should always exist and be owned by root, as a mitigation against this type of exploit. While I did not regard it as a vulnerability by itself, Sander Bos has since submitted a separate bug report about this issue. It’s been assigned CVE-2019-11485 and fixed by changing the directory that the lock file is stored in.
The second mitigation is a slightly obscure bit of logic in the kernel, based on RLIMIT_CORE. RLIMIT_CORE is a resource limit: the maximum size of the core file. The value RLIMIT_CORE == 1 is used as a special value to indicate that the process is a crash reporter and should not generate a core dump if it crashes (to prevent recursion). I found an explanation of this mitigation in this comment.
I got lucky with the RLIMIT_CORE mitigation. It turns out that you can use prlimit to modify the RLIMIT_CORE of another process! You need to have appropriate permissions to do so, of course, but I found that it works as soon as apport enters the “totally drop privs” state (refer to the table). Unfortunately, it isn’t possible to increase the value of RLIMIT_CORE with prlimit, but I am able to drop it to zero, which is sufficient for this exploit.
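Python exposes the same system call as resource.prlimit (Linux only), which makes the trick easy to try out. This sketch is an illustration rather than the PoC's code (the PoC is written in C): zero_core_limit and demo are hypothetical names, and a sleeping child process that I own stands in for apport.

```python
import resource
import subprocess
import sys


def zero_core_limit(pid):
    """Lower the soft RLIMIT_CORE of another process to 0; return the new limits."""
    _soft, hard = resource.prlimit(pid, resource.RLIMIT_CORE)
    # Keep the hard limit as it is and only lower the soft limit.
    resource.prlimit(pid, resource.RLIMIT_CORE, (0, hard))
    return resource.prlimit(pid, resource.RLIMIT_CORE)


def demo():
    # A sleeping child stands in for the target process.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        return zero_core_limit(child.pid)
    finally:
        child.kill()
        child.wait()
```

The call only succeeds when the caller's credentials match those of the target, which is why the PoC has to wait until apport has fully dropped its privileges.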
Signal handling
Part of my cunning plan was to crash apport by sending it a SIGSEGV. That doesn’t work because apport sets a signal handler for SIGSEGV:
def setup_signals():
    '''Install a signal handler for all crash-like signals, so that apport is
    not called on itself when apport crashed.'''
    signal.signal(signal.SIGILL, _log_signal_handler)
    signal.signal(signal.SIGABRT, _log_signal_handler)
    signal.signal(signal.SIGFPE, _log_signal_handler)
    signal.signal(signal.SIGSEGV, _log_signal_handler)
    signal.signal(signal.SIGPIPE, _log_signal_handler)
    signal.signal(signal.SIGBUS, _log_signal_handler)
Again, it appears that the motivation for this is to prevent apport from running recursively on itself. Luckily for me, the list of signals that setup_signals sets handlers for isn’t sufficiently thorough. The section 7 man page for signal has a table titled “Standard signals”. Here’s a short excerpt:
Signal | Value | Action | Comment |
---|---|---|---|
SIGINT | 2 | Term | Interrupt from keyboard |
SIGQUIT | 3 | Core | Quit from keyboard |
SIGILL | 4 | Core | Illegal Instruction |
… | … | … | … |
Any signal with “Core” in the “Action” column will trigger a core dump. Apport’s list of signal handlers includes the most common core-generating signals, but it’s far from comprehensive. There are several left to choose from. My exploit uses SIGTRAP.
Exploit implementation
I’ve posted the source code for my proof-of-concept exploit on GitHub. It works mostly according to the plan that I described above, but with a few tweaks to account for the obstacles discussed above. This is the sequence of events in the revised plan:
- I start a /bin/sleep.
- I create /var/crash/.lock, so that I can delete it later.
- I kill /bin/sleep with a SIGSEGV.
- Apport starts up to generate a crash report for /bin/sleep.
- I replace ~/.apport-ignore.xml with a symlink at exactly the right moment, so that apport loads a forbidden file into memory.
- I replace /var/crash/.lock with a new file, to bypass the file lock and enable a second apport to run at the same time as the first.
- I use prlimit to set apport’s RLIMIT_CORE to zero.
- I crash apport by sending it a SIGTRAP.
- A second apport starts up to generate a crash report for the first apport.
- The second apport writes out a crash report for the first, containing a copy of the forbidden file in the core dump.
All that’s left to discuss is how I time the symlink switcheroo. I initially thought it would be very difficult to get the exploit working, because there is such a short time-interval between the call to os.access and when the file is opened. But it turns out that it is hilariously easy to win a race against Python when you are programming in C. The crucial moment in the PoC, when the switcheroo happens, is at line 155. I use inotify for the timing. By running sudo strace -e file -tt -p <apport PID>, I discovered that a file named expatbuilder.cpython-36.pyc is always opened immediately before ~/.apport-ignore.xml is parsed. By watching for an IN_OPEN event on that file, I can time the switcheroo very precisely.
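The inotify technique looks roughly like this. The PoC itself is in C; this is a Python approximation that calls the libc inotify functions through ctypes, because the standard library has no inotify module. It assumes Linux, watches a temporary file rather than expatbuilder.cpython-36.pyc, and the helper names (watch_for_open, wait_for_open, demo) are hypothetical.

```python
import ctypes
import os
import select
import struct
import tempfile

IN_OPEN = 0x00000020                   # value from <sys/inotify.h>
_EVENT_HEADER = struct.Struct("iIII")  # wd, mask, cookie, len

_libc = ctypes.CDLL(None, use_errno=True)


def watch_for_open(path):
    """Return an inotify fd that reports IN_OPEN events on `path`."""
    fd = _libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    if _libc.inotify_add_watch(fd, os.fsencode(path), IN_OPEN) < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, "inotify_add_watch failed")
    return fd


def wait_for_open(fd, timeout=5.0):
    """Wait for the next event; return its mask, or None on timeout."""
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return None
    data = os.read(fd, 4096)
    _wd, mask, _cookie, _length = _EVENT_HEADER.unpack_from(data)
    return mask


def demo():
    with tempfile.NamedTemporaryFile() as target:
        fd = watch_for_open(target.name)
        try:
            open(target.name).close()  # the event the watcher is waiting for
            # In the exploit, the symlink swap would happen right after this returns.
            return wait_for_open(fd)
        finally:
            os.close(fd)
```

In the exploit the watcher and the process that opens the file are different processes, so the read on the inotify descriptor acts as the starting gun for the swap.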
You have got to be kidding me!
When I was finally able to get the exploit working, I excitedly went to look at the crash report in /var/crash and saw the following:
kev@constellation:~$ ls -al /var/crash/
total 4492
drwxrwsrwt 2 root whoopsie 12288 Nov 5 12:26 .
drwxr-xr-x 17 root root 4096 Jul 17 19:31 ..
-rw-r----- 1 root whoopsie 4583201 Nov 5 12:26 _usr_share_apport_apport.0.crash
That was definitely a facepalm moment. The file is owned by root. What happened? I was sure that it would be owned by me, because my PoC doesn’t send the SIGTRAP until after the first apport has entered the “totally drop privs” state (refer to the table). The apport process is completely owned by me at the moment when it crashes, so surely I should be able to read the crash report?
This problem is caused by a subtle detail in how apport determines the owner of the crashed process. This happens in get_pid_info, by running os.stat on /proc/[pid]/stat. This is explained in a couple of comments scattered throughout the source code, such as here and here. It’s a mitigation against accidentally leaking sensitive information when a setuid binary crashes (which is almost exactly what I’m trying to do). In my case, apport was started as a root process, so /proc/[pid]/stat is owned by root, even after the transition to the “totally drop privs” state. I haven’t been able to find any way to defeat this protection.
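The ownership lookup described above amounts to something like the following. This is a rough sketch of the idea, not the actual body of get_pid_info; proc_stat_owner is a name I chose for illustration.

```python
import os


def proc_stat_owner(pid):
    """Return the (uid, gid) that own /proc/<pid>/stat.

    This reads the ownership of the procfs entry itself rather than asking
    the process for its current user ids.
    """
    st = os.stat("/proc/%d/stat" % pid)
    return st.st_uid, st.st_gid
```

For example, proc_stat_owner(os.getpid()) reports the owner that a crash reporter using this method would assign to the current process.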
The consolation prize is that the exploit works. When I looked at the contents of the file, this is what I saw:
The other good news is that the exploit is very quick and reliable. I thought that the timing of the symlink switcheroo might make it unreliable, but I found that it works perfectly every time.
So all is not lost. Although the crash report is owned by root, it’s also readable by whoopsie, which means that if I can find a vulnerability in the whoopsie daemon, I might also be able to read the contents of the crash report.
To be continued …
Stay tuned for the next two posts in this series:
- December 19, 2019: Ubuntu apport PID recycling vulnerability (CVE-2019-15790)
- December 23, 2019: Ubuntu whoopsie integer overflow vulnerability (CVE-2019-11484)