Coordinated Disclosure Timeline

Summary

The aws/karpenter-provider-aws repository is vulnerable to Poisoned Pipeline Execution (PPE), which may lead to exfiltration of AWS keys.

Project

AWS Karpenter Provider

Tested Version

Latest commit at the time of reporting.

Details

Issue 1: Poisoned Pipeline Execution (PPE) in snapshot-pr.yaml workflow. (GHSL-2024-314)

The snapshot-pr.yaml workflow runs when the ApprovalComment workflow completes.

on:
  workflow_run:
    workflows:
      - ApprovalComment
    types:
      - completed

The snapshot job downloads the artifact uploaded by the triggering workflow (the uses: ./.github/actions/download-artifact step) and then reads PR_COMMIT and PR_NUMBER from the artifact's metadata.txt file:

run: |
  pr_number="$(head -n 2 /tmp/artifacts/metadata.txt | tail -n 1)"
  pr_commit="$(tail -n 1 /tmp/artifacts/metadata.txt)"
  echo PR_COMMIT="$pr_commit" >> "$GITHUB_ENV"
  echo PR_NUMBER="$pr_number" >> "$GITHUB_ENV"

A malicious actor has two ways to control the contents of the metadata.txt file.

The first is to submit a Pull Request that modifies the ApprovalComment workflow so that it is triggered on pull_request events and uploads an artifact with arbitrary contents. The downside for the attacker is that this Pull Request may require approval before any workflows triggered by the pull_request event are run. These approvals depend on the repository configuration and, by default, apply to first-time contributors. The attacker could therefore fix a typo or submit another valid contribution to become a contributor to the repository and skip explicit approvals for future Pull Requests.

The second is to submit a Pull Request and then submit a Pull Request Review containing any of the magic words specified in the approval-comment.yaml workflow. In this case, the ApprovalComment workflow is triggered regardless of the required approvals, since those only affect pull_request-triggered workflows. The contents of metadata.txt will then be the magic word, followed by the Pull Request number and the Pull Request HEAD commit SHA:

run: |
  mkdir -p /tmp/artifacts
  { echo "$REVIEW_BODY"; echo "$PULL_REQUEST_NUMBER"; echo "$COMMIT_ID"; } >> /tmp/artifacts/metadata.txt
  cat /tmp/artifacts/metadata.txt
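
The round trip between the two run steps can be reproduced locally. The following is a minimal sketch, assuming hypothetical values for the review body, Pull Request number, and commit SHA (none of them are taken from the repository), and hypothetical helper names. It writes metadata.txt the way the ApprovalComment step does and reads it back the way the snapshot job does.

```shell
#!/usr/bin/env bash
# Sketch only: all sample values and helper names below are hypothetical.

artifacts="$(mktemp -d)"

# Writer side (mirrors the ApprovalComment step):
# line 1 = review body, line 2 = PR number, line 3 = commit SHA.
write_metadata() {
  local review_body="$1" pr_number="$2" commit_id="$3"
  { echo "$review_body"; echo "$pr_number"; echo "$commit_id"; } >> "$artifacts/metadata.txt"
}

# Reader side (mirrors the snapshot job).
read_pr_number() { head -n 2 "$artifacts/metadata.txt" | tail -n 1; }
read_pr_commit() { tail -n 1 "$artifacts/metadata.txt"; }

write_metadata "/karpenter snapshot" "1234" "0123456789abcdef0123456789abcdef01234567"

echo "PR_NUMBER=$(read_pr_number)"
echo "PR_COMMIT=$(read_pr_commit)"
```

The last line of the file is always taken as the commit to check out, so whoever controls the Pull Request HEAD (or the artifact itself) controls PR_COMMIT.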

Taking that into consideration, when the snapshot-pr workflow runs the checkout step for the ${{ env.PR_COMMIT }} reference, it checks out the attacker-controlled HEAD of the Pull Request. The files in the runner workspace therefore cannot be trusted from this point on.

The snapshot-pr workflow then uses the id-token: write permission to request a short-lived token to access AWS:

      - uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2
        with:
          role-to-assume: 'arn:aws:iam::${{ vars.SNAPSHOT_ACCOUNT_ID }}:role/${{ vars.SNAPSHOT_ROLE_NAME }}'
          aws-region: ${{ vars.SNAPSHOT_REGION }}

This step requests a token for a specific role (arn:aws:iam::021119463062:role/GithubSnapshot) and writes the access keys to the shell environment (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN).

The workflow then executes arbitrary attacker-controlled code by running make snapshot, since the Makefile is under attacker control.

- run: make snapshot
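
To illustrate why this matters: variables exported into the job environment are inherited by every child process, including whatever commands a Makefile target runs. The following is a minimal sketch with placeholder values. The variable contents, the helper name, and the child command are hypothetical stand-ins, not real credentials or the project's Makefile.

```shell
#!/usr/bin/env bash
# Sketch only: placeholder values, not real credentials.
export AWS_ACCESS_KEY_ID="AKIAEXAMPLEPLACEHOLDER"
export AWS_SECRET_ACCESS_KEY="placeholder-secret"
export AWS_SESSION_TOKEN="placeholder-token"

# Stand-in for a command run by an untrusted Makefile target: a child process
# that reports which credential variables it can see (names only, not values).
visible_credential_vars() {
  bash -c '
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN; do
      if [ -n "${!name:-}" ]; then echo "$name"; fi
    done'
}

echo "Visible to child process:"
visible_credential_vars
```

All three variable names are reported by the child process, which is the same position a make target from the checked-out Pull Request would be in.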

Impact

The snapshot job runs with pull-requests: write, id-token: write, and statuses: write permissions, which allow an attacker to obtain a short-lived token for any resources that GitHub's OIDC provider has been granted access to. For the GithubSnapshot role specifically, it appears that the attacker would also be able to gain access to the Docker Hub credentials:

WARNING! Your password will be stored unencrypted in /home/runner/.docker/config.json.

In addition to the GithubSnapshot role used by this specific workflow, the following roles are also used, and are therefore available to an attacker:

The effective impact depends on the permissions that the AWS token has for each role, which have not been evaluated since I have not attempted the attack. However, based on the role names, it is safe to assume that an attacker would at least be able to publish new releases and perform a supply chain attack.

Issue 2: Poisoned Pipeline Execution (PPE) in e2e-scale-trigger.yaml workflow. (GHSL-2024-315)

Similarly, the e2e-scale-trigger workflow gets triggered by the ApprovalComment workflow and calls resolve-args.yaml to parse the review body:

if: (github.repository == 'aws/karpenter-provider-aws' && (github.event_name != 'workflow_run' || github.event.workflow_run.conclusion == 'success')) || github.event_name == 'workflow_dispatch'
uses: ./.github/workflows/resolve-args.yaml
with:
  allowed_comment: "scale"

Note that for a workflow_run-triggered workflow, github.repository will always be aws/karpenter-provider-aws, so this check does not offer the protection it would for a pull_request-triggered workflow. This is because workflow_run runs in the context of the default branch, which belongs to aws/karpenter-provider-aws, whereas pull_request runs in the context of the PR branch, which belongs to the repository the PR originates from.

The resolve-args reusable workflow decides whether the workflows should run based on the body of the review (attacker-controlled), and which git reference to check out based on the Pull Request the review is associated with:

- if: github.event_name == 'workflow_run'
  uses: ./.github/actions/download-artifact
- id: resolve-step
  env:
    ALLOWED_COMMENT: ${{ inputs.allowed_comment }}
  run: |
    if [[ "${{ github.event_name }}" == "workflow_run" ]]; then
      if [[ "$(head -n 1 /tmp/artifacts/metadata.txt)" == *"$ALLOWED_COMMENT"* ]]; then
         echo SHOULD_RUN=true >> "$GITHUB_OUTPUT"
      else
         echo SHOULD_RUN=false >> "$GITHUB_OUTPUT"
      fi
      echo GIT_REF="$(tail -n 1 /tmp/artifacts/metadata.txt)" >> "$GITHUB_OUTPUT"
    else
      echo SHOULD_RUN=true >> "$GITHUB_OUTPUT"
      echo GIT_REF="" >> "$GITHUB_OUTPUT"
    fi
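
Note that the comparison in resolve-step is a substring match on the first line of the review body. The following is a minimal sketch of that check in isolation, using hypothetical review bodies; the should_run helper name is also hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: isolates the glob match used by resolve-step.
should_run() {
  local first_line="$1" allowed_comment="$2"
  # *"$allowed_comment"* matches any first line that contains the word anywhere.
  if [[ "$first_line" == *"$allowed_comment"* ]]; then
    echo true
  else
    echo false
  fi
}

should_run "/karpenter scale" "scale"              # intended magic word
should_run "please do not scale this yet" "scale"  # unrelated text containing the word
should_run "looks good to me" "scale"              # no occurrence of the word
```

The first two calls print true and the third prints false, so any review whose first line merely contains the allowed comment is enough to set SHOULD_RUN.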

The e2e-scale-trigger workflow will then call ./.github/workflows/e2e.yaml with the attacker-controlled GIT_REF:

    if: needs.resolve.outputs.SHOULD_RUN == 'true'
    uses: ./.github/workflows/e2e.yaml
    with:
      suite: Scale
      git_ref: ${{ needs.resolve.outputs.GIT_REF }}
      region: ${{ inputs.region || 'us-west-2' }}
      enable_metrics: ${{ inputs.enable_metrics || true }}
      workflow_trigger: "scale"
      # Default to true unless using a workflow_dispatch
      cleanup: ${{ github.event_name != 'workflow_dispatch' && true || inputs.cleanup }}
    secrets:
      SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}

The e2e.yaml workflow then checks out the untrusted code:

- uses: actions/checkout@9bb56186c3b09b4f86b1c65136769dd318469633 # v4.1.2
  with:
    ref: ${{ inputs.git_ref }}

It then runs untrusted code. For example, the attacker could change the code of the commit-status local action and inject a shell script with the payload:

uses: ./.github/actions/commit-status/start

Impact

An attacker will be able to run arbitrary code in the context of the GitHub runner, which will allow them to obtain an AWS JWT token for any of the above-mentioned roles and perform any actions allowed for this token.

Issue 3: Poisoned Pipeline Execution (PPE) in e2e-matrix-trigger.yaml workflow. (GHSL-2024-316)

The e2e-matrix-trigger workflow similarly uses the resolve-args workflow to process the ApprovalComment review body and extract the PR git reference:

if: (github.repository == 'aws/karpenter-provider-aws' && (github.event_name != 'workflow_run' || github.event.workflow_run.conclusion == 'success')) || github.event_name == 'workflow_dispatch'
uses: ./.github/workflows/resolve-args.yaml
with:
  allowed_comment: "snapshot"

The e2e-matrix job then calls the ./.github/workflows/e2e-matrix.yaml reusable workflow with the untrusted PR git reference:

e2e-matrix:
  permissions:
    id-token: write # aws-actions/configure-aws-credentials@v4.0.1
    statuses: write # ./.github/actions/commit-status/start
  needs: [resolve]
  if: needs.resolve.outputs.SHOULD_RUN == 'true'
  uses: ./.github/workflows/e2e-matrix.yaml
  with:
    git_ref: ${{ needs.resolve.outputs.GIT_REF }}
    region: ${{ inputs.region || 'us-east-2' }}
    workflow_trigger: "matrix"
    # Default to true unless using a workflow_dispatch
    cleanup: ${{ github.event_name != 'workflow_dispatch' && true || inputs.cleanup }}
  secrets:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}

The e2e-matrix.yaml workflow calls the e2e.yaml workflow, which, as we already saw, checks out the untrusted PR branch and executes local actions:

- uses: ./.github/workflows/e2e.yaml
  with:
    suite: ${{ matrix.suite.name }}
    git_ref: ${{ inputs.git_ref }}
    region: ${{ matrix.suite.region }}
    k8s_version: ${{ inputs.k8s_version }}
    cleanup: ${{ inputs.cleanup }}
    workflow_trigger: ${{ inputs.workflow_trigger }}
  secrets:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}

Impact

An attacker will be able to run arbitrary code in the context of the GitHub runner, which will allow them to obtain an AWS JWT token for any of the above-mentioned roles and perform any actions allowed for this token.

Issue 4: Poisoned Pipeline Execution (PPE) in e2e-version-compatibility-trigger.yaml workflow. (GHSL-2024-317)

The e2e-version-compatibility-trigger workflow is similarly triggered by the ApprovalComment workflow. It processes the review body with the resolve-args workflow and calls e2e-matrix, which in turn calls the e2e reusable workflow, which checks out the untrusted Pull Request branch and executes untrusted code.

jobs:
  resolve:
    if: (github.repository == 'aws/karpenter-provider-aws' && (github.event_name != 'workflow_run' || github.event.workflow_run.conclusion == 'success')) || github.event_name == 'workflow_dispatch'
    uses: ./.github/workflows/resolve-args.yaml
    with:
      allowed_comment: "versionCompatibility"
  versionCompatibility:
    permissions:
      id-token: write # aws-actions/configure-aws-credentials@v4.0.1
      statuses: write # ./.github/actions/commit-status/start
    needs: [resolve]
    if: needs.resolve.outputs.SHOULD_RUN == 'true'
    strategy:
      fail-fast: false
      matrix:
        k8s_version: ["1.25", "1.26", "1.27", "1.28", "1.29", "1.30", "1.31"]
    uses: ./.github/workflows/e2e-matrix.yaml
    with:
      region: ${{ inputs.region || 'eu-west-1' }}
      git_ref: ${{ needs.resolve.outputs.GIT_REF }}
      k8s_version: ${{ matrix.k8s_version }}
      workflow_trigger: "versionCompatibility"
      # Default to true unless using a workflow_dispatch
      cleanup: ${{ github.event_name != 'workflow_dispatch' && true || inputs.cleanup }}
      parallelism: 1
    secrets:
      SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}

Impact

An attacker will be able to run arbitrary code in the context of the GitHub runner, which will allow them to obtain an AWS JWT token for any of the above-mentioned roles and perform any actions allowed for this token.

Credit

This issue was discovered and reported by GHSL team member @pwntester (Alvaro Muñoz).

Contact

You can contact the GHSL team at securitylab@github.com. Please include a reference to GHSL-2024-314, GHSL-2024-315, GHSL-2024-316, or GHSL-2024-317 in any communication regarding these issues.