Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Git hashing for Git fetching #9025

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Ericson2314
Copy link
Member

@Ericson2314 Ericson2314 commented Sep 21, 2023

Motivation

The git fetcher is now more tree-hash-oriented, and we will want to integrate this with git fetching eventually. This PR exposes treeHash inputs and outputs in a few ways for this purpose.

Eventually, we should add something like builtins.derivation's outputHashMode to builtins.fetchTree, in order to specify we should use git hashing, and then this and the store-layer git hashing should meet together, ensuring we have the same tree hash end-to-end.

Context

Part of RFC 133

Tracking issue #8919

depends on #8918

Priorities

Add 👍 to pull requests you find important.

@github-actions github-actions bot added new-cli Relating to the "nix" command with-tests Issues related to testing. PRs with tests have some priority store Issues and pull requests concerning the Nix store fetching Networking with the outside (non-Nix) world, input locking labels Sep 21, 2023
@Ericson2314 Ericson2314 changed the title Git hashing for fetching Git hashing for Git fetching Sep 21, 2023
@Ericson2314 Ericson2314 force-pushed the git-hashing-for-fetching branch from 79fc9f6 to cdbbc69 Compare September 21, 2023 15:28
Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursory review

src/libexpr/primops/fetchTree.cc Outdated Show resolved Hide resolved
src/libfetchers/git.cc Outdated Show resolved Hide resolved
src/libfetchers/git.cc Outdated Show resolved Hide resolved
src/libstore/store-api.cc Outdated Show resolved Hide resolved
src/libutil/fs-sink.hh Outdated Show resolved Hide resolved
@github-actions github-actions bot removed the with-tests Issues related to testing. PRs with tests have some priority label Oct 6, 2023
@Ericson2314 Ericson2314 force-pushed the git-hashing-for-fetching branch from 88e2bda to 4c8736c Compare October 6, 2023 16:43
@github-actions github-actions bot added the with-tests Issues related to testing. PRs with tests have some priority label Oct 6, 2023
@Ericson2314 Ericson2314 force-pushed the git-hashing-for-fetching branch from 4c8736c to e4c7e89 Compare October 6, 2023 16:45
Copy link

dpulls bot commented Feb 28, 2024

🎉 All dependencies have been resolved !

@Ericson2314 Ericson2314 added this to the git-hashing stabilisation milestone Mar 22, 2024
@Ericson2314 Ericson2314 force-pushed the git-hashing-for-fetching branch 3 times, most recently from f0d1e1a to 33e945b Compare March 25, 2024 16:31
@Ericson2314 Ericson2314 marked this pull request as ready for review March 25, 2024 16:31
@Ericson2314 Ericson2314 requested a review from edolstra as a code owner March 25, 2024 16:31
Ericson2314 and others added 2 commits March 27, 2024 10:38
The git fetcher is now more tree-hash-oriented, and we will want to
integrate this with git fetching eventually. This PR exposes `treeHash`
inputs and outputs in a few ways for this purpose.

Eventually, we should add something like `builtins.derivation`'s
`outputHashMode` to `builtins.fetchTree`, in order to specify we should
use git hashing, and then this and the store-layer git hashing should
meet together, ensuring we have the same tree hash end-to-end.

Part of RFC 133

Co-Authored-By: Matthew Bauer <[email protected]>
Co-Authored-By: Carlo Nucera <[email protected]>
Co-authored-by: Robert Hensing <[email protected]>
@Ericson2314 Ericson2314 force-pushed the git-hashing-for-fetching branch from 33e945b to a8314ff Compare March 27, 2024 14:40
@roberth
Copy link
Member

roberth commented Mar 27, 2024

and then this and the store-layer git hashing should meet together

They already meet, and they involve a transformation that turns git trees without submodules and filtering into git trees where those things are resolved.
These are in general different trees, even if they happen to line up exactly in the simple cases.

Could you clarify what "meeting" entails and how we benefit from that?

if (auto narHash = input.getNarHash()) {
attrs.alloc("narHash").mkString(narHash->to_string(HashFormat::SRI, true));
} else if (auto treeHash = input.getTreeHash()) {
attrs.alloc("treeHash").mkString(treeHash->to_string(HashFormat::SRI, true));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have two kinds of tree hashes:

  • tree hashes from git commits, which do not correspond to fetched trees in general
  • tree hashes from file system objects returned by fetchTree

Which one is this, and what is the purpose of exposing it to the expression language?

Copy link
Member

@roberth roberth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The meaning of the tree hash must be unambiguous.
  • We need a concrete benefit to users before exposing arbitrary data to the expression language.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fetching Networking with the outside (non-Nix) world, input locking new-cli Relating to the "nix" command store Issues and pull requests concerning the Nix store with-tests Issues related to testing. PRs with tests have some priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants