`zizmor` is a tool that uses static analysis to find potential security
issues in GitHub Actions [0]. (Yes, it's a bit absurd that GitHub
made a CI system so complicated that tools like this were created, but
I digress.)
Given our increase in GHA usage recently, I think this is a good step
towards keeping our security posture in tip-top shape. (It also keeps
with the theme of automating as many things as possible!)
The rule related to the usages of dangerous-triggers have been disabled
to avoid false-positives. Explanations about the usage of
`pull_request_target` and expectations around its usage can be found in
`.github/workflows/README.md`.
[0]: https://woodruffw.github.io/zizmor/
Co-authored-by: Thomas Gerbet <thomas@gerbet.me>
In a recent change, the path matching was simplified in maintainers.nix.
This revealed a pre-existing logic bug: Packages without `meta.position`
would get an empty string as their file name. The change would then
cause this empty string to always be matched, which lead to maintainer
pings for these packages in seemingly random PRs, when some of their
dependencies were changed.
We should never try to ping maintainers through package aliases, this
can only lead to errors. One example case is, where an attribute is a
throw alias, but then re-introduced in a PR. This would trigger the
throw. By disabling aliases, we can fallback gracefully.
Over the last couple of months we have been migrating a lot of the old
bash code to JavaScript, which is supported in GitHub Actions via
`actions/github-script`. This change documents a "manual ratchet check"
for this migration - new code should only be introduced as JavaScript
and not as Bash. This will help us to eventually succeed with the
migration and ensure quality and maintainability.
We are migrating to JavaScript, because:
1. Using JavaScript is GitHub's [recommendation] against injection attacks.
Using `actions/github-script` has first-class support for the event
context and does not require to resort back to environment variables in
most cases. When environment variables need to be used, these are
accessed via `process.env`, without a risk for accidental injections.
Using `actions/github-script` is also recommended in a recent
[survey] of open source supply chain compromises:
> Finally, since two out of three compromises were due to shell injection,
> it might be safer to use a proper programming language, like JavaScript
> with actions/github-script, or any other language accessing the context
> via environment variables instead of YAML interpolation.
2. Handling even environment variables in Bash safely is almost
impossible. For example arithmetic expressions cause arbitrary code
execution vulnerabilities. While a lot of contributors are somehwat
familiar writing Bash code for builders, writing *safe* Bash code for
CI is a very different matter. Few people, if any, know how to do
this.
3. GitHub Action's security model is quite unintuitive and even if some
code runs with trusted inputs today, it may later be used in a more
exposed context. Instead of making judgement calls about language
choice case by case, a clear policy helps writing things defensively
from the beginning.
4. We have developed a framework around our github-script based tools in
`ci/github-script`. This provides a local `nix-shell` environment
with the right dependencies and a local runner for these scripts for
quick testing, debugging and development. No matter, whether you're
developing a new feature, fixing bugs or reviewing a PR - this allows
much quicker verification of the scripts, *without* running
everything in a fork or test organization.
5. This framework also provides helpers for challenges that come up with
GHA. One example is rate-limiting, where we have a helper script that
will handle all rate-limiting needs for us, preventing us from
running out of API calls and thus breaking CI entirely. We can only
use these tools consistently, if we consistently use JavaScript code.
6. Using JavaScript allows us to handle JSON natively. Using
`octokit/rest.js` provides first-class integration with GitHub's API.
Together, this makes these scripts much more maintainable than
resorting to `gh` and `jq`.
[recommendation]: https://docs.github.com/en/actions/reference/security/secure-use#use-an-action-instead-of-an-inline-script
[survey]: https://words.filippo.io/compromise-survey/
Some PRs are empty on purpose, for example the yearly notification about
the election for voters. We should not close these because the merge
commit is empty - only if there was a change intended, but the merge
commit *becomes* empty, we should act.
The parse check runs multiple `nix-instantiate` processes in parallel -
and they can error out with "SQLite database '...' is busy" while
setting up the state directories. This was observed once locally.
Initialising the store should fix this.
There is no point in running the much slower `parse-each` part for each
interpreter/version. The CI job is not meant as a development tool that
should report all parse errors at once, but as a confirmation that no
parse errors are present on *different interpreter versions*.
Once this test fails, Eval, nixpkgs-vet and treefmt will most likely
fail as well - with more information for multiple parse errors.
This is another attempt at fixing the annoying nixpkgs-vet errors in CI,
which just throw with `error: SQLite database '...' is busy`.
The assumption is that this happens while initially setting up the state
directories. nixpkgs-vet runs `nix-instantiate` on both the base and the
head commit and these two could interfere.
If the change of a PR has already been merged to the target branch
elsewhere, the PR will not be auto-closed by GitHub - and will still
show the same original diff. Still, the temporary merge commit is
actually empty. This causes all kinds of strange CI behavior, from not
showing rebuilds to not pinging maintainers.
We check the merge commit during labeling anyway, to see whether a merge
conflict is present. It's easy to just look a the number of affected
files in this merge commit - and if there are none, we can just
automatically close the PR as no longer relevant.
This allows requesting reviewers for pure refactor PRs, which don't
cause a rebuild of the package. This is only possible for by-name,
because only here the package names can be inferred from the filenames.
This adds support to ping maintainers when arbitrary files in by-name
are changed, as long as they still cause a rebuild. For example, this is
the case when changing .json files with version metadata. These were
previously not detected as belonging to the package, and didn't cause
maintainer pings.
The only reason for the additional `lib.hasSuffix` check was, that the
`lib.removePrefix` was broken - it was never adjusted when porting this
from ofborg, so the relative path was wrong and no prefix ever removed,
since no packages are in `ci/`.
This additionally strips the leading `/`, so that `relevantFilenames`
will then have paths starting with `pkgs/...`, similar to how git
reports those paths in the `changedpathsjson` file. This allows simple
equality comparison.
See https://github.com/NixOS/nixpkgs/issues/437208#issuecomment-3288623669
Depends on https://github.com/NixOS/org/pull/172
As documented below, the idea is to essentially group all changes
rebuilding all VM tests with kernel updates and merge them together into
`master` whenever the Linux kernels get updated.
This documents the workflow of updates in the nixpkgs manual. While at
it, I removed the README from the packages because
* it's horribly outdated
* I didn't even know it exists which confirms that its discoverability
was very poor
and added the relevant portions into the nixpkgs manual as well.
This removes the "owners" check from codeowners-validator. With it, all
tokens and permissions can be removed, because these were only needed to
make these requests.
This solves the problem of codeowners-validator not supporting our new
nested team structure for nixpkgs-maintainers. To make the onboarding of
new teams easier, we moved all teams "under" the nixpkgs-maintainers
team. This makes them inherit the right privileges (triage) for Nixpkgs.
However, this inheritance is not recognized by codeowners-validator,
thus it assumes that these teams don't have access to Nixpkgs. This then
fails the owners check immediately.
Removing the owners check also has a few other advantages:
- This check depends on external state: If a user is renamed or a team
removed, the check will fail. This makes it a bad check for required
status checks or merge queues - the check might fail randomly,
independent of the current PR.
- Running this check in a fork will never work, because the respective
users and teams don't have access to the fork's repo.
Both of this required us to set `continue-on-error: true` most of the
time.
This reverts commit f8210561f3 (ci.eval.compare: turn warnings into errors, 2025-09-16).
It turns out that there are normal math warnings and we don't want to block CI on the math coming out wrong.