-
-
Notifications
You must be signed in to change notification settings - Fork 171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add git-based detection of tags at HEAD to improve PublicRelease detection #876
Conversation
* Managed git does not support annotated tags yet * json output reads: "NBGV_BuildingTags": "System.Collections.Generic.List`1[System.String]",
In 3c93a89 I added a workaround for |
Throwing was intentional to match HeadCanonicalName behaviour. Throwing fails the tests though. Maybe this is because the tests run in a cloud build environment and HeadCanonicalName isn't used there? this.BuildingRef = cloudBuild?.BuildingTag ?? cloudBuild?.BuildingBranch ?? context.HeadCanonicalName; This would mean that the tests would currently fail when executed locally.
Return null if no HEAD could be determined, return an empty collection if there are no matching tags.
@AArnott anything I could do from my side to assist/simplify getting this merged? |
I'm looking at this now and preparing some local changes to push to your PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this together. I have several comments I look forward to discussing with you.
var tagObjId = GitObjectId.Parse(line.Substring(0, 40)); | ||
var refName = line.Substring(41); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to not handle a line pattern that I see in one of my repos:
^3d2137a79ee4e3621696cfbde8d4f1c0e98bc5f6
40614e83b9b3892e05efd6155c3b61035bb5542a refs/tags/drop/a.proddiag/official.00000.00
^2f0a22666a958c15f7e7486a0a14490f7b8e210f
9aa2dc36733d3f3076702b31a3dc754c144a9473 refs/tags/drop/a.proddiag/official.26601.00
See the leading caret? I don't know what that means, but it seems your line parsing assumes no such prefix exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ^
prefix indicates a "peeled line" that lists the object ID referenced by the tag object on the previous line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
git's source supports that: https://github.com/git/git/blob/d9d677b2d8cc5f70499db04e633ba7a400f64cbf/refs/packed-backend.c#L542
A "record" here is one line for the reference itself and zero or one peel lines that start with '^'.
Thus, if I understand correctly, simply skipping these lines would be functionally equivalent to if they didn't exist, because we'd peel the refs in the next step. Having them does provide us with a way to speed things up though and hits the notch @AArnott mentioned here - we could decide if we're interested in this ref in place. Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried a tag referencing another tag referencing a commit, and the peeled line after the first tag listed the object ID of the second tag, not the commit ID. So if you implement some optimisation that uses those lines, please consider this scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know, thanks! Actually I think this behaviour makes sense and is also in line with what the code in this PR currently does. My take on "tags at HEAD" behaves like git tag --points-at HEAD
would. Means it includes any lightweight as well as annotated tags pointing at HEAD. But not any tags that point at HEAD transitively. A git show HEAD
on the other hand does include transitive tags. Not including transitive tags might lead to less lookups => increased performance. In a broken repo, lookup including transitive tags could lead to an infinite loop.
I guess transitive tags are kind of an edge case so my plan was to just not support them at this point. I think specifying "We do what git tag --points-at HEAD
does" makes sense too. If the need to support transitive tags would arise, a second PR would probably be easier than this first one. What do you think @AArnott?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main concern is that the behavior should not depend on whether the peeled lines are in packed-refs or not. Whether nbgv recursively follows tags in general is a matter of policy and either choice is fine with me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. You might want to take a look at my two recent commits in that regard.
{ | ||
if (objectId.Equals(tagObjId)) | ||
{ | ||
tags.Add(tagNameCandidate); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filtering the tags at this point is quite late. NB.GV functions in some repos with tens of thousands of tags, and the code would have parsed and allocated multiple strings for all of them. I think we'll need to optimize this by walking the tags and immediately skipping lines with non-matching object IDs before we allocate anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this can be solved without the candidate list and that the number of string allocations can be reduced. The current implementation without tag support does however allocate multiple strings per packed-ref line too and it doesn't seem to be a performance hit or is there some detail I'm missing?
Looking up the target of an annotated tag might lead to an IO operation - read the relevant git pack. Thus, I'm wondering if collecting the candidates first wouldn't have a minor impact compared to opening the file.
I don't have any that large repo at hand. If this turns out to have a substantial negative performance impact, it might be an option to make annotated tag support opt-in/out, as reading the packed-refs file itself (plus the files in refs/tags
) is sufficient for lightweight tag lookup. Edit: with the two commits incl. 49b9dc9 we don't need additional IO operations for annotated tags in packed-refs
if peel lines are available.
I'm working on a commit to remove the candidate list and reduce the string allocations though. Edit: see a84ad8b
Without this fix we might have considered one level of transitive annotated tags, if a peel line is present in packed-refs. With this fix we also avoid reading the git pack of annotated tags that have peel lines and do not match the object id we are looking for.
I thought there should be something to detect if reftable is used, and fall back to LibGit2Context in that case, but…
|
The build failure reproduces in main, so it's unlikely to be due to this PR. I don't know how that happened, but I'll be investigating it. |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@georg-jung I was prepared to merge this, but happened to hit an error related to the change. Can you check this out?
|
@dotnet-policy-service agree |
Hey @AArnott, sorry for the delay. As a first measure, I tried to reproduce this and created a branch in my fork that incorporates the recent fixes in main (to make the pipelines in my repo work again) as well as this PR's code. I wasn't able to reproduce though. The stacktrace points to a piece of code that wasn't touched by this PR - altough I don't really understand the line number in the stacktrace, as that line only reads What are the chances this was just some temporary hick up with the pipeline? An IO error in the checkout? Does anything stop us form just running that pipeline again and see if its reproducible? |
Are you looking at the same error? This is what I see:
That looks extremely relevant to this PR, considering this PR introduces
Checkout is long behind this failure. And exactly the same error happened on both Windows and linux, so no, I don't think we can right this one off. Sometimes a failure can occur because of the particular way a git database is represented on disk, so that can be a challenge to repro on another clone of the repo. But it's not something we want to sweep under the rug, because if it can happen in a PR build, it can certainly happen on a dev box among the many thousands that this code runs on. |
I've been looking this over to try to figure out what may be going wrong, and it occurs to me that we don't have a single new test for the tags functionality in this PR. Sorry I missed this before, but we will need some tag tests to accept this PR as well. |
I added a test for Tags to your PR. If you can think of anything else to test, you're welcome to. We have the facility to check in whole repo databases in order to be able to test particular git db formats (pack files, peeled lines, etc) too. |
... and the re-build failed at exactly the same place, in the same way. So, at least it's reproducible. I can't reproduce it locally (yet). |
I pushed a commit that will capture the entire clone, including .git directory, next time it fails. That should help us repro it. |
Yay! I have a local repro by downloading the drop I changed the pipeline to produce on failure. I'm investigating now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a legitimate failure, introduced by the GitAnnotatedTagReader
code, apparently.
You can repro by following these steps:
- Download the
drop
artifact from here and use 7-zip to extract it. Complete all the rest of these steps within the unzipped directory. - Open VS to the solution within the unzipped directory.
- Add
Debugger.Launch()
to theGitObjectId.ParseHex
method. - Run
dotnet publish -o ../nerdbank-gitversioning.npm/out/nbgv.cli/tools/net6.0/any
from thesrc\nbgv
directory. - Run
yarn build
from thesrc\nerdbank-gitversioning.npm
directory. - When the JIT debugger attach dialog appears, have the VS with your unzipped solution open attach to the debuggee.
- Configure the debugger to break on first chance exceptions, at least for
IndexOutOfRangeException
. GitException
is thrown a lot due to the issue of relying on exceptions to recognize the wrong object type. You can turn off break on first chance exceptions for this particular exception type.- Hit F5 till you encounter a debug assertion failure and/or the
IndexOutOfRangeException
that causes the ultimate failure.
Can you please look into this?
Also, given the GitException
is thrown about 200 times before this failure, this is likely going to need to be addressed. I already have a local code change that I hope will address this, although it causes other failures, so I'll work through that after we get this fatal error resolved.
Fixes #873.
This implements my second idea described in detail in #873.
My thoughts form there:
This comes with parsing support for annotated tags for managed git and LibGit2. Lightweight tags pointing HEAD are considered as well as annotated tags. Nested annotated tags that indirectly point at HEAD are not considered (intentionally, changing probably wouldn't be hard).
BuildingRef
value is not changed.Currently, the output of
nbgv get-version -f json
is extended as follows:Obviously theI added anNBGV_BuildingTags
value isn't here to stay. I thought I'd leave this for discussion, which of these values should be generated and what their value should be.[Ignore]
to skip it in a3089b1.Probably I made some opinionated decisions when putting this together, don't hesitate to change what you don't like.