fix: better support for dangling revisions#11753
Conversation
9b2c66d to
ed4d553
Compare
28b0382 to
115ebfe
Compare
src/libfetchers/git.cc
Outdated
| try { | ||
| runProgram("git", true, { "-C", repoDir, "--git-dir", gitDir, "cat-file", "-e", input.getRev()->gitRev() }); | ||
| doFetch = false; | ||
| auto res = runProgram("git", true, { "-C", repoDir, "--git-dir", gitDir, "for-each-ref", fmt("--contains=%s", input.getRev()->gitRev()) }); |
There was a problem hiding this comment.
I don't know how clever git is about this combination of flags. A naive implementation of for-each, --contains would be slow when many refs need to be checked.
There was a problem hiding this comment.
I tested this on a repo with 45k refs on a branch with 377k commits and it took ~300ms. So, I thought I was good. But, your comment made me double-check against the linux kernel. That took 10m....... Something doesn't scale. I'll revert. It's not a huge deal if we don't do this, since we either have the rev (in which case we fetch the rev explicitly, like below) or we don't (in which case doFetch = true anyway).
| Activity act(*logger, lvlTalkative, actUnknown, fmt("making temporary clone of '%s'", repoDir)); | ||
| runProgram("git", true, { "-C", tmpDir, "fetch", "--quiet", "--force", | ||
| "--update-head-ok", "--", repoDir, "refs/*:refs/*" }, {}, true); | ||
| "--update-head-ok", "--", repoDir, "refs/*:refs/*", input.getRev()->gitRev() }, {}, true); |
There was a problem hiding this comment.
Wouldn't this cause the fetch to fail when the remote's uploadpack.allowReachableSHA1InWant is disabled?
There was a problem hiding this comment.
I'm not sure. There actually seem to be a few related options for configuring a git server and I'm not sure which combinations of them support fetching by rev. From here:
If the remote has enabled the options uploadpack.allowTipSHA1InWant, uploadpack.allowReachableSHA1InWant, or uploadpack.allowAnySHA1InWant, they may alternatively be 40-hex sha1s present on the remote.
But, I don't think remote configuration matters, here. In this case (regardless of the user-provided URL corresponding to either of a network-accessed repository or filesystem-accessed repository) the fetch, here, is against the fetcher cache whose config files have never specified (and don't even support, I think) any of these remote configuration values. Fetching by rev is supported according to my experimentation and the reproduction embedded in the commit message.
1dc363c to
dd279ac
Compare
The submodules branch generates a checkout which depends on only the
references, not all revisions. If there are dangling revisions in the
bare fetcher clone (a correctness issue itself) then the subsequent
temporary directory clone will disastrously fail with messages like
```
fatal: reference is not a tree: 2da0785fa32ce4c628d501c8743c88be00835b50
```
You should be able to reproduce this behavior with
```bash
this_nixpkgs='github:nixos/nixpkgs?rev=36ac8d7e411eeb11ac0998d5a39e329c1226e541';
this_git=$(nix build --print-out-paths "$this_nixpkgs#git")/bin/git;
export GIT_AUTHOR_NAME=foo \
GIT_AUTHOR_EMAIL=foo@foo.foo \
GIT_AUTHOR_DATE=100000000 \
GIT_COMMITER_NAME=foo \
GIT_COMMITER_EMAIL=foo@foo.foo \
GIT_COMMITTER_DATE=100000000;
tmpgit=$(mktemp -d);
echo "Repo'll be at $tmpgit";
"$this_git" init "$tmpgit";
"$this_git" -C "$tmpgit" commit --allow-empty -m "foo"; # 8c2146823865b76da067b3bb458611a0a19ede3b
"$this_git" -C "$tmpgit" commit --allow-empty -m "foo"; # 2da0785fa32ce4c628d501c8743c88be00835b50
"$this_git" -C "$tmpgit" reset --hard HEAD~1; # 8c2146823865b76da067b3bb458611a0a19ede3b
for version in 18 19 20 21 22 23 24; do
echo "Checking nix version 2.$version";
this_nix=$(nix build "$this_nixpkgs#nixVersions.nix_2_$version^out" --print-out-paths)/bin/nix;
url="file:///$tmpgit";
# populate fetcher-v1.sqlite and the gitv3/ dir
"$this_nix-build" --expr $'builtins.fetchGit {
url = \"'"$url"$'\";
rev = \"8c2146823865b76da067b3bb458611a0a19ede3b\";
submodules = true;
}';
# put the "corrupted" git dir in place of the fetched one
cache_dir="${XDG_CONFIG_HOME:-$HOME}"/.cache/nix/gitv3;
tmp=$(mktemp);
printf "%s" "$url" >"$tmp";
fetched_dir="$cache_dir"/$(printf "%s" "$url" | $this_nix --extra-experimental-features nix-command hash file --type sha256 --base32 "$tmp");
rm -rf "$fetched_dir";
git clone --bare "$tmpgit" "$fetched_dir";
# See if we can realise the derivation pointing to the dangling commit
_NIX_FORCE_HTTP=1 "$this_nix-build" --impure --expr $'
{ pkgs ? (import <nixpkgs> { }) }: pkgs.stdenv.mkDerivation {
pname = "foo";
version = "1.0.0";
dontUnpack = true;
'"
src = builtins.fetchGit {
url = \"file:///$tmpgit\";
rev = \"2da0785fa32ce4c628d501c8743c88be00835b50\";
submodules = true;
};
buildPhase = ''
touch \$out;
exit 0;
'';
}
"
done
```
This solution attempts to avoid the issue in the submodules path by
simply checking out the specified revision (in addition to all
references to preserve the original behavior - although, I think _only_
the revision is necessary).
dd279ac to
591a826
Compare
Motivation
A lot of this is laid out in #11752. I kindly ask reviewers to look at that issue.
Context
Pleae see #11752
Priorities and Process
Add 👍 to pull requests you find important.
The Nix maintainer team uses a GitHub project board to schedule and track reviews.