The `cancelled()` condition seems to only apply when *the whole
workflow* was cancelled. This is not the case when a single job is
cancelled due to timeout.
We can replicate the intended behavior by checking each `needs.*.result` manually.
This currently times out after 3 minutes. Give it a bit more time. 10
minutes might be excessive, but we only really want to guard against a
stuck job taking 6 hours.
This currently happens, for still unknown reasons, for the "check cherry
picks" job. The job gets cancelled by GHA mid-way. This should be the
same as an error, because an important check didn't run: Merging should
be blocked and auto-merge should not succeed.
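A minimal sketch of the manual check, assuming hypothetical job names - treat any cancelled or failed dependency the same as a failure, so the merge stays blocked:

```yaml
# Hypothetical job names; `if: always()` keeps this job running even when a
# dependency was cancelled, and the step fails on any non-success result.
result:
  runs-on: ubuntu-latest
  needs: [eval, check-cherry-picks]
  if: always()
  steps:
    - name: Fail on cancelled or failed dependencies
      if: contains(needs.*.result, 'cancelled') || contains(needs.*.result, 'failure')
      run: exit 1
```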
The conclusion of the `versions` job propagates through `eval` to
`compare`, which meant the `compare` job was skipped: no rebuild labels,
no reviewer requests.
Also, we don't want to run eval when `versions` runs but fails.
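A sketch of the kind of conditions this needs (job layout abbreviated, expressions illustrative): run `eval` only when `versions` succeeded, and evaluate `compare`'s own condition instead of inheriting a skip from upstream:

```yaml
# Illustrative conditions; steps omitted.
eval:
  needs: [versions]
  if: needs.versions.result == 'success'

compare:
  needs: [eval]
  # always() prevents the skip from propagating; still require eval to succeed.
  if: always() && needs.eval.result == 'success'
```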
The additional `workflows` permissions are required to backport
Dependabot updates. The permissions had been added to the app a while
ago, but we forgot to actually use them.
With this change, we start running Eval on all available Lix and Nix
versions. Because this requires a lot of resources, this complete test
is only run when `ci/pinned.json` is updated.
The resulting outpaths are checked for consistency with the target
branch. A difference will cause the `report` job to fail, thus blocking
the merge, ensuring Eval consistency for Nixpkgs across different
versions.
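A sketch of what the consistency check could look like in the `report` job (file names are hypothetical):

```yaml
# Hypothetical artifact names; fail when any tested version produced
# different outpaths than the target branch.
- name: Check outpath consistency
  run: |
    for f in outpaths-*.json; do
      if ! diff -q outpaths-target.json "$f"; then
        echo "::error::Eval outpaths differ for $f"
        exit 1
      fi
    done
```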
This implements a kind of "ratchet-style" check: since we originally
confirmed that the versions in Nixpkgs at the time of this commit match
the Eval behavior of Nix 2.3, we can ensure consistency with Nix 2.3
down the road, even without testing for it explicitly.
There had been one regression in Eval consistency for Nix between 2.18
and 2.24: two tests in `tests.devShellTools` produce different results
between Lix 2.91+ (which was forked from Nix 2.18) and Nix 2.24+. Since
it seems unlikely that this difference will still be "fixed" at this
point, I added an exception for these tests.
As a bonus, we also present the total time in seconds it takes for Eval
to complete for every tested version in a summary table. This allows us
to easily see performance improvements for Eval due to version updates.
At this stage, this time only includes the "outpaths" step of Eval, but
not the generation of attrpaths beforehand.
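A sketch of how such a row could be emitted (variable names are hypothetical), using the job summary file:

```yaml
# Hypothetical variables; appends one row per tested version to the
# summary table rendered on the workflow run page.
- name: Record outpaths timing
  run: |
    echo "| $VERSION | $OUTPATHS_SECONDS s |" >> "$GITHUB_STEP_SUMMARY"
```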
The Dependabot update changed the hashes to the latest main branch
commit instead of the v5.0.0 tag, and it also didn't adjust the tags in
the comments accordingly. Last but not least, one of the references used
a `@v5` reference instead of the commit hash. The latter is probably
what Dependabot tripped over.
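For reference, the intended pinning convention looks roughly like this (the SHA below is a placeholder, not the real one):

```yaml
# Pin to a full commit hash; the comment names the tag that hash points to.
- uses: actions/checkout@0000000000000000000000000000000000000000 # v5.0.0
```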
This is slightly faster than downloading and extracting a tarball and
additionally allows a sparse checkout. No need to download docs or nixos
for our purpose.
The data is quite noisy, but suggests improvements of anywhere between
5 and 15 seconds for each job using the pinned nixpkgs.
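A sketch of the sparse checkout (the path list is illustrative), assuming `actions/checkout`'s `sparse-checkout` input:

```yaml
# Illustrative paths; docs and nixos are not downloaded at all.
- uses: actions/checkout@v5
  with:
    sparse-checkout: |
      pkgs
      lib
```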
None of our jobs is expected to run for 6 hours, the GitHub limit. These
limits are generous and take into account that some jobs need to wait
for others.
If jobs exceed these times, most likely something else is wrong and
needs investigation.
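The limits are set per job via `timeout-minutes`; the values below are purely illustrative:

```yaml
# Illustrative values: generous enough to cover waiting on other jobs,
# far below the 6-hour GitHub maximum.
jobs:
  eval:
    timeout-minutes: 60
  build:
    timeout-minutes: 120
```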
This reverts commit f2648b263b.
While the idea to never use swap was fine, in practice this meant that
when nix ran out of memory, some other process was killed instead. This
led to the job no longer being cancellable, so it had to time out before
subsequent jobs could be scheduled - which can take up to 6 hours on
GitHub Actions by default.
Re-enable the swap file to catch this case more gracefully. The goal is
still to never actually *use* the swap file during Eval; it is just a
safeguard.
We keep the changed chunkSize and do not revert it - this makes it
slightly less likely to hit the swap file when running with Lix.
Recent performance tests show that (a) swapping heavily slows down the
Eval job, while (b) lowering the chunkSize does not affect run-time. It
does lower memory usage, though - so we can get rid of swapping entirely
by reducing chunkSize accordingly.
Introduces a basic merge queue workflow to initially only run lints.
This will avoid accidentally merging changes which break nixfmt after
its recent update to 1.0.0.
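A minimal sketch of the merge queue trigger; the lint step entry point is hypothetical:

```yaml
# Runs the required checks for PRs sitting in the merge queue.
on:
  merge_group:

jobs:
  lint:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v5
      - name: Run lints
        run: ./ci/lint.sh  # hypothetical entry point
```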
This adds a build job for the tarball, which might help uncover eval
issues on attributes not normally touched by Eval, aka those added in
`pkgs/top-level/packages-config.nix`.
Most of the checks we do for cherry-picks are dismissable warnings, with
one exception: When a commit hash has been found, but this hash is not
available in any of the pickable branches, we raise this with
severity=error. This should also *block* the merge and not be
dismissable. That's because this is a fixable issue in every case.
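In github-script terms, the non-dismissable case roughly corresponds to failing the check run instead of only annotating (message text illustrative):

```yaml
- uses: actions/github-script@v7
  with:
    script: |
      // Illustrative message; setFailed() fails the check run (unlike a
      // plain warning annotation), which blocks the merge.
      core.setFailed('Cherry-picked commit not found in any pickable branch')
```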
This turns the check-cherry-pick script into a github-script based
JavaScript program. This makes it much easier to extend to check reverts
or merge commits later on.
Since all github-scripts need to be written in CommonJS, we now default
to it by not setting a package.json. Editor support for .js files is
slightly better than for .cjs. To still allow using module imports in
the test runner script, we trick Node into loading the script itself as
a module again via `--import ./run`.
This just moves things around to use less specific naming - `labels` is
only *one* script that can potentially be run locally while still being
written in github-script. Later, we can add more.
When a PR is merged and labeled afterwards - with a non-backport label -
the following will happen:
- The first backport job is triggered on the merge.
- The second backport job is triggered on the label event.
- The second job will cancel the first one due to the concurrency group.
- The second job will cancel itself because the label event didn't
contain a backport label.
Both jobs end up cancelled and no backport happens.
We made the backport action idempotent upstream a while ago, so we don't
need to cancel those actions. Instead, we'll run all of them -
subsequent actions running through will just stay silent anyway.
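One way to express this - whether the concurrency group is kept at all is a separate choice, and the group name is illustrative - is to stop cancelling in-progress runs:

```yaml
# Illustrative group name; queue backport runs instead of cancelling them.
concurrency:
  group: backport-${{ github.event.pull_request.number }}
  cancel-in-progress: false
```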
Committers could get the false impression from, e.g., `PR / Build / aarch64-linux` that this workflow builds the packages changed in the current PR. Such a misunderstanding could pair poorly with the "enable auto-merge" button, once that's enabled.
By re-organizing the flow in `handle()` we can start labeling both
issues and pull requests, and only make the relevant API requests in the
PR case.
At first glance, we might think that we only need to label the big batch
list of issues and not those recently updated. But that's wrong: for
recently updated issues it's important to label quickly, because the
stale label needs to be *removed*, too.
We already tried to fix this case earlier, but didn't account for all
cases: a scheduled workflow can also encounter a pull request with a
failed PR workflow. This failure doesn't need to be in the Eval part, so
artifacts could *still* be available. To make sure PRs always get
rebuild labels, just ignore the status condition. Either the artifact is
there, or it is not.
The previous implementation had two problems:
- When switching from /search to /pulls, we disabled the additional GET
on each single pull request - which means no test merge commit is
created for any PR, so merge conflicts will not actually be detected.
- By using `item` in the pull-request triggered case, this falls back to
`context.payload.pull_request`, which is the state *at the beginning* of
the workflow run. But this renders our "let's wait 3 minutes before
checking merge_commit_sha" logic void. While we wait for 3 minutes, we
still use the *old* value afterwards...
Just making the extra request every time simplifies the logic and solves
both problems.
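For the pull-request-triggered case, the extra request could look roughly like this (step placement is hypothetical):

```yaml
- uses: actions/github-script@v7
  with:
    script: |
      // Always re-fetch the pull request instead of trusting the (possibly
      // stale) event payload from the start of the run.
      const { data: pr } = await github.rest.pulls.get({
        owner: context.repo.owner,
        repo: context.repo.repo,
        pull_number: context.payload.pull_request.number,
      })
      core.info(`merge_commit_sha: ${pr.merge_commit_sha}`)
```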