Coordinated Disclosure Timeline
- 2024-02-29: Sent an email with the report to the maintainer.
- 2024-03-11: Sent a follow-up message asking for an update.
- 2024-04-10: Created an issue asking for a contact person regarding the vulnerability or for enabling private vulnerability reporting (PVR). The maintainer enabled private vulnerability reporting.
- 2024-04-11: Created the reports in PVR.
- 2024-04-11: The maintainer created fixes for the issues.
- 2024-04-12: CVEs were assigned and advisories were published.
Summary
Kohya_ss v22.6.1 is vulnerable to multiple command injections and path injections.
Project
kohya_ss
Tested Version
v22.6.1
Details
Issue 1: Command injection in basic_caption_gui.py (GHSL-2024-019)
The gradio_basic_caption_gui_tab function in basic_caption_gui.py takes multiple user inputs, images_dir among others, and uses them as arguments in a command that is later executed, leading to arbitrary command injection.
def caption_images(
    caption_text,
    images_dir,
    overwrite,
    caption_ext,
    prefix,
    postfix,
    find_text,
    replace_text,
):
    # Code cut for readability
    run_cmd = f'python "tools/caption.py"'
    run_cmd += f' --caption_text="{caption_text}"'
    # Add optional flags to the command
    if overwrite:
        run_cmd += f' --overwrite'
    if caption_ext:
        run_cmd += f' --caption_file_ext="{caption_ext}"'
    run_cmd += f' "{images_dir}"'
    log.info(run_cmd)
    # Run the command based on the operating system
    if os.name == 'posix':
        os.system(run_cmd)
    else:
        subprocess.run(run_cmd)
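To illustrate why this interpolation is dangerous, the following minimal sketch rebuilds the command string the same way caption_images does and prints it for a crafted images_dir value. The helper name build_run_cmd and the payload are assumptions made for illustration only; nothing is executed.

```python
def build_run_cmd(caption_text: str, images_dir: str) -> str:
    """Mirror the string concatenation used in caption_images (illustration only)."""
    run_cmd = 'python "tools/caption.py"'
    run_cmd += f' --caption_text="{caption_text}"'
    run_cmd += f' "{images_dir}"'
    return run_cmd


# Hypothetical attacker-controlled value: the leading quote closes the quoted
# argument, and the semicolons delimit an additional shell command.
payload = '"; echo injected; "'
run_cmd = build_run_cmd("a caption", payload)
print(run_cmd)
# python "tools/caption.py" --caption_text="a caption" ""; echo injected; ""
```

If such a string were passed to os.system, a POSIX shell would treat the text after the first semicolon as a separate command.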
Issue 2: Command injection in git_caption_gui.py (GHSL-2024-020)
The gradio_git_caption_gui_tab function in git_caption_gui.py takes multiple user inputs, train_data_dir among others, and uses them as arguments in a command that is later executed, leading to arbitrary command injection.
def caption_images(
    train_data_dir,
    caption_ext,
    batch_size,
    max_data_loader_n_workers,
    max_length,
    model_id,
    prefix,
    postfix,
):
    # Code cut for readability
    run_cmd = f'{PYTHON} finetune/make_captions_by_git.py'
    if not model_id == '':
        run_cmd += f' --model_id="{model_id}"'
    run_cmd += f' --batch_size="{int(batch_size)}"'
    run_cmd += (
        f' --max_data_loader_n_workers="{int(max_data_loader_n_workers)}"'
    )
    run_cmd += f' --max_length="{int(max_length)}"'
    if caption_ext != '':
        run_cmd += f' --caption_extension="{caption_ext}"'
    run_cmd += f' "{train_data_dir}"'
    log.info(run_cmd)
    # Run the command
    if os.name == 'posix':
        os.system(run_cmd)
    else:
        subprocess.run(run_cmd)
Issue 3: Command injection in group_images_gui.py (GHSL-2024-021)
The gradio_group_images_gui_tab function in group_images_gui.py takes multiple user inputs, group_size among others, and uses them as arguments in a command that is later executed, leading to arbitrary command injection.
def group_images(
    input_folder,
    output_folder,
    group_size,
    include_subfolders,
    do_not_copy_other_files,
    generate_captions,
    caption_ext,
):
    # Code cut for readability
    run_cmd = f'{PYTHON} "{os.path.join("tools","group_images.py")}"'
    run_cmd += f' "{input_folder}"'
    run_cmd += f' "{output_folder}"'
    run_cmd += f' {(group_size)}'
    if include_subfolders:
        run_cmd += f' --include_subfolders'
    if do_not_copy_other_files:
        run_cmd += f' --do_not_copy_other_files'
    if generate_captions:
        run_cmd += f' --caption'
    if caption_ext:
        run_cmd += f' --caption_ext={caption_ext}'
    log.info(run_cmd)
    if os.name == 'posix':
        os.system(run_cmd)
    else:
        subprocess.run(run_cmd)
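Unlike the numeric arguments in Issue 2, which are passed through int(), group_size is interpolated as-is and without quotes. One possible way to narrow that particular parameter is to coerce it to an integer before use. This is a sketch under that assumption: the helper name parse_group_size is hypothetical and this is not necessarily how the maintainer fixed the issue.

```python
def parse_group_size(group_size) -> int:
    """Coerce a UI-provided value to a positive integer or raise ValueError."""
    value = int(group_size)  # raises ValueError for strings such as "4; id"
    if value < 1:
        raise ValueError("group_size must be a positive integer")
    return value


print(parse_group_size("4"))
try:
    parse_group_size("4; id")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Numeric coercion only covers this one parameter; the string parameters (input_folder, output_folder, caption_ext) would still reach the shell unless the command is run without one.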
Issue 4: Command injection in finetune_gui.py (GHSL-2024-022)
The finetune_tab function in finetune_gui.py takes multiple user inputs, caption_metadata_filename among others, and uses them as arguments in a command that is later executed, leading to arbitrary command injection.
if generate_caption_database:
    if not os.path.exists(train_dir):
        os.mkdir(train_dir)
    run_cmd = f"{PYTHON} finetune/merge_captions_to_metadata.py"
    if caption_extension == "":
        run_cmd += f' --caption_extension=".caption"'
    else:
        run_cmd += f" --caption_extension={caption_extension}"
    run_cmd += f' "{image_folder}"'
    run_cmd += f' "{train_dir}/{caption_metadata_filename}"'
    if full_path:
        run_cmd += f" --full_path"
    log.info(run_cmd)
    if not print_only_bool:
        # Run the command
        if os.name == "posix":
            os.system(run_cmd)
        else:
            subprocess.run(run_cmd)
Impact
These issues may lead to arbitrary command injection on the server. Note that the issues affect POSIX systems, where the command is passed to os.system. On Windows systems the command is also executed (line 56), but it is run through subprocess.run without invoking the shell (without the shell=True argument), which prevents exploitation via the PoC.
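The difference between shell and non-shell execution can be sketched as follows. This is a minimal illustration, assuming the command is rebuilt as an argument list; build_caption_args and echo_argv are hypothetical helpers and do not represent the fix shipped by the maintainer. The child process only echoes the argument it receives.

```python
import subprocess
import sys


def build_caption_args(caption_text, images_dir, overwrite=False, caption_ext=""):
    """Build the command as a list so that each value stays a single argument."""
    args = [sys.executable, "tools/caption.py", f"--caption_text={caption_text}"]
    if overwrite:
        args.append("--overwrite")
    if caption_ext:
        args.append(f"--caption_file_ext={caption_ext}")
    args.append(images_dir)
    return args


def echo_argv(value):
    """Start a child process without a shell and return the argument it received."""
    child_code = "import sys; print(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


payload = '"; echo injected; "'  # hypothetical attacker-controlled value
args = build_caption_args("a caption", payload)
print(args[-1] == payload)            # the payload is one list element
print(echo_argv(payload) == payload)  # the child receives it verbatim
```

Because no shell parses the list, quotes and semicolons carry no special meaning. A value that begins with a dash could still be read as an option by the called script, so validating inputs remains useful.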
Issue 5: Path injection in common_gui.py add_pre_postfix function (GHSL-2024-023)
The add_pre_postfix function takes user-controlled input from many sources, for example the images_dir variable of the gradio_basic_caption_gui_tab function in basic_caption_gui.py, and uses it to create a path and write a file to it.
The vulnerability can only be exploited if the folder that the attacker wants to write to contains at least one file with one of the extensions ".jpg", ".jpeg", ".png", or ".webp", because of the extension check in the function. With this vulnerability, it is possible to control the contents and the extension of the written file, but not its name. Because of these limitations, it is a limited file write.
def add_pre_postfix(
    folder: str = "",
    prefix: str = "",
    postfix: str = "",
    caption_file_ext: str = ".caption",
) -> None:
    if prefix == "" and postfix == "":
        return
    image_extensions = (".jpg", ".jpeg", ".png", ".webp")
    image_files = [
        f for f in os.listdir(folder) if f.lower().endswith(image_extensions)
    ]
    for image_file in image_files:
        caption_file_name = os.path.splitext(image_file)[0] + caption_file_ext
        caption_file_path = os.path.join(folder, caption_file_name)
        if not os.path.exists(caption_file_path):
            with open(caption_file_path, "w", encoding="utf8") as f:
                separator = " " if prefix and postfix else ""
                f.write(f"{prefix}{separator}{postfix}")
        else:
            with open(caption_file_path, "r+", encoding="utf8") as f:
                content = f.read()
                content = content.rstrip()
                f.seek(0, 0)
                prefix_separator = " " if prefix else ""
                postfix_separator = " " if postfix else ""
                f.write(
                    f"{prefix}{prefix_separator}{content}{postfix_separator}{postfix}"
                )
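The claim that the extension is attacker-controlled while the file name is not follows from how the target path is derived. The sketch below isolates that derivation; the helper name caption_path and the sample values are assumptions for illustration, and no file is written.

```python
import os


def caption_path(folder: str, image_file: str, caption_file_ext: str) -> str:
    """Reproduce how add_pre_postfix derives the path it writes to."""
    caption_file_name = os.path.splitext(image_file)[0] + caption_file_ext
    return os.path.join(folder, caption_file_name)


# The base name always comes from an existing image in the folder,
# while the extension is whatever the caller supplies.
print(caption_path("some_folder", "photo.png", ".caption"))
print(caption_path("some_folder", "photo.png", ".sh"))
```

The second call yields a path ending in photo.sh: the stem is fixed by the image already present in the folder, but the extension follows the supplied value.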
Issue 6: Path injection in common_gui.py find_replace function (GHSL-2024-024)
The find_replace function takes multiple user inputs (folder_path, caption_file_ext, search_text, replace_text), which allow an attacker to find and replace content in any file on the system whose contents the attacker knows. For example, an attacker would be able to change the contents of a configuration file for a service to arbitrary data.
def find_replace(
    folder_path: str = "",
    caption_file_ext: str = ".caption",
    search_text: str = "",
    replace_text: str = "",
) -> None:
    log.info("Running caption find/replace")
    if not has_ext_files(folder_path, caption_file_ext):
        msgbox(
            f"No files with extension {caption_file_ext} were found in {folder_path}..."
        )
        return
    if search_text == "":
        return
    caption_files = [f for f in os.listdir(folder_path) if f.endswith(caption_file_ext)]
    for caption_file in caption_files:
        with open(os.path.join(folder_path, caption_file), "r", errors="ignore") as f:
            content = f.read()
        content = content.replace(search_text, replace_text)
        with open(os.path.join(folder_path, caption_file), "w") as f:
            f.write(content)
Impact
These issues may lead to limited file write.
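One common way to reduce this class of path injection is to confine user-supplied folders to an allowed base directory and to restrict the caption extension to a known set. The following is a sketch under those assumptions, not the maintainer's fix: resolve_inside, check_caption_ext, and the allowlist contents are hypothetical, and Path.is_relative_to requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Return the resolved path if it lies inside base_dir, else raise ValueError."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):  # Python 3.9+
        raise ValueError(f"{user_path!r} is outside the allowed directory")
    return candidate


ALLOWED_CAPTION_EXTENSIONS = {".caption", ".txt"}  # hypothetical allowlist


def check_caption_ext(caption_file_ext: str) -> str:
    """Accept only extensions from the allowlist, else raise ValueError."""
    if caption_file_ext not in ALLOWED_CAPTION_EXTENSIONS:
        raise ValueError(f"unsupported caption extension: {caption_file_ext!r}")
    return caption_file_ext
```

With such checks in place, a call like resolve_inside("datasets", "../../etc") or check_caption_ext(".conf") would raise ValueError before any file is opened.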
CVE
- GHSL-2024-019 - CVE-2024-32022
- GHSL-2024-020 - CVE-2024-32026
- GHSL-2024-021 - CVE-2024-32025
- GHSL-2024-022 - CVE-2024-32027
- GHSL-2024-023 - CVE-2024-32024
- GHSL-2024-024 - CVE-2024-32023
Credit
These issues were discovered and reported by GHSL team member @sylwia-budzynska (Sylwia Budzynska). The vulnerabilities were found with help of CodeQL and additional CodeQL modeling.
Contact
You can contact the GHSL team at securitylab@github.com. Please include a reference to GHSL-2024-019, GHSL-2024-020, GHSL-2024-021, GHSL-2024-022, GHSL-2024-023, or GHSL-2024-024 in any communication regarding these issues.