
Runs-on for linux-build-lib and linux-test (2X faster CI)#20107

Merged
comphead merged 3 commits into apache:main from blaginin:db/runs-on-step-1
Feb 3, 2026

Conversation

@blaginin
Collaborator

@blaginin blaginin commented Feb 2, 2026

Which issue does this PR close?

Related to #13813

Thanks to the infra team and @gmcdonald specifically, we now have the ability to use more powerful AWS-provided runners in our CI 🥳

DataFusion has one of the largest runtimes across Apache projects, which is why we're bringing those runners here first. Since we're the first to test this, I think a gradual transition is reasonable, so I updated the two most frequently failing actions to be hosted in AWS. The plan is to confirm that everything works and then transition the remaining actions.

What changes are included in this PR?

If the org is apache, we'll now use ASF-provisioned runners in the ASF infra AWS account. Forks will not have access to those runners, so they will fall back to GitHub-provisioned ones.
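
A minimal sketch of that selection rule, written in Python rather than workflow expression syntax. The function name `select_runner` and the constants are hypothetical and exist only for illustration; the label string is the one used in this PR's workflow change. It mirrors the `condition && value || fallback` idiom, which only behaves like a ternary because the label is never empty.

```python
ASF_OWNER = "apache"
FALLBACK_RUNNER = "ubuntu-latest"

# Label template taken from this PR's workflow change; {0} is the run id.
RUNS_ON_TEMPLATE = (
    "runs-on={0},family=m7a,cpu=16,image=ubuntu24-full-x64,"
    "extras=s3-cache,disk=large,tag=datafusion"
)


def select_runner(repository_owner: str, run_id: int) -> str:
    """Return the runner label a job would request (hypothetical helper)."""
    label = RUNS_ON_TEMPLATE.format(run_id)
    # `cond and label or fallback` mirrors `cond && label || fallback`.
    # It would wrongly pick the fallback if `label` were an empty string,
    # which cannot happen here because the template is non-empty.
    return repository_owner == ASF_OWNER and label or FALLBACK_RUNNER


if __name__ == "__main__":
    # Upstream repository: requests the AWS-hosted runner.
    print(select_runner("apache", 12345))
    # A fork under another owner: falls back to the GitHub-hosted runner.
    print(select_runner("some-fork-owner", 12345))
```

Because the decision depends only on the repository owner, forks take the fallback branch without needing any extra configuration.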

Are these changes tested?

Yes.

Are there any user-facing changes?

No

@blaginin blaginin self-assigned this Feb 2, 2026
@github-actions github-actions bot added the documentation and development-process labels Feb 2, 2026
@martin-g
Member

martin-g commented Feb 2, 2026

Nice !

@blaginin blaginin requested review from alamb and findepi February 2, 2026 12:49
Contributor

@comphead comphead left a comment

Thanks @blaginin, let's give it a try.

Are you aware whether any limits apply to AWS runners? I checked https://docs.github.com/en/actions/reference/limits but cannot tell whether it covers AWS runners.

@blaginin
Collaborator Author

blaginin commented Feb 2, 2026

Are you aware whether any limits apply to AWS runners? I checked https://docs.github.com/en/actions/reference/limits but cannot tell whether it covers AWS runners.

Yes, those limits do not apply to us. We will be bounded by the ASF AWS account budget and service quotas; monitoring that will be on me and the infra team.

I will merge this PR soon and we can start testing 🚀

@alamb
Contributor

alamb commented Feb 2, 2026

I think it is a great idea to try one or two jobs initially and then slowly migrate over

The biggest concern I have is if there is some difference between the normal runners (that will run on PRs) and the ones that will run on main.

Contributor

@Jefffrey Jefffrey left a comment

DataFusion has one of the largest runtimes across Apache projects

😅

@comphead
Contributor

comphead commented Feb 3, 2026

If everyone is okay, I'm merging the PR

@comphead comphead added this pull request to the merge queue Feb 3, 2026
Merged via the queue into apache:main with commit 96a6bd7 Feb 3, 2026
29 checks passed
@blaginin
Collaborator Author

blaginin commented Feb 3, 2026

The biggest concern I have is if there is some difference between the normal runners (that will run on PRs) and the ones that will run on main

Those on main will also use AWS runners, here's an example 🙂

  linux-build-lib:
    name: linux build test
-   runs-on: ubuntu-latest
+   runs-on: ${{ github.repository_owner == 'apache' && format('runs-on={0},family=m7a,cpu=16,image=ubuntu24-full-x64,extras=s3-cache,disk=large,tag=datafusion', github.run_id) || 'ubuntu-latest' }}
Member

github.repository_owner == 'apache'

❤️

@alamb
Contributor

alamb commented Feb 4, 2026

BTW I don't have data, but this feels like it made a major difference on CI time recently

@martin-g
Member

martin-g commented Feb 4, 2026

Yes, the affected CI jobs are now about twice as fast!

@comphead
Contributor

comphead commented Feb 5, 2026

Thanks @blaginin, I think we need the same for Comet, as it's heavily tested. Appreciate if you can share how to start

@blaginin
Collaborator Author

blaginin commented Feb 5, 2026

Appreciate if you can share how to start

Of course! I'll switch the remaining DF actions and then happy to do the same with Comet. I'll also write a guide for transitioning :)

@comphead
Contributor

comphead commented Feb 5, 2026

Appreciate if you can share how to start

Of course! I'll switch the remaining DF actions and then happy to do the same with Comet. I'll also write a guide for transitioning :)

Thanks, and FYI, I created apache/datafusion-comet#3404



6 participants