Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Moby dependencies, address skipped builds #828

Merged
merged 1 commit into from
Nov 13, 2023

Conversation

AaronFriel
Copy link
Contributor

@AaronFriel AaronFriel commented Oct 31, 2023

Adding additional logging to sess.Run() call here, and fixing moby/moby#45935, as a prerequisite to debugging #812.

While looking into #812, when using a locally built provider, updates to the Image resource would not result in a new image built and a corresponding new RepoDigest output value. This was root caused to the Moby issue below after adding a log statement following sess.Run() in provider/image.go.

However, the behavior seen while reproducing is different from our issue. That local repro was too consistent, it occurred in over 20 consecutive builds following the repro steps at the bottom of this commit.

After updating the Moby dependencies, the image build was no longer skipped and I was unable to reproduce #812. Neither Docker v23 release notes nor the git diff from 23.0.1 to 23.0.7 contain any obvious changes that explain that.

The most likely conclusion here is:

  1. There is still an unknown bug, likely a race condition, that can trigger issue 812, and I was unsuccessful in reproducing it.
  2. The Moby dependency update fixed an issue with the session dialing that has the same symptoms as issue 812, which is fixed in the change to client/client.go in the Moby diff above.

Repro

The Pulumi program in examples/multi-stage-build-go was run repeatedly to trigger #812 via:

$ uuidgen -r > ./app/example-lockfile.json && pulumi up --skip-preview --diff

When the symptoms occurred, verbose logging indicated the image build was skipped entirely - docker.ImageBuild did not run.

@github-actions
Copy link

Does the PR have any schema changes?

Does the PR have any schema changes?

Looking good! No breaking changes found.
No new resources/functions.

Maintainer note: consult the runbook for dealing with any breaking changes.

Adding additional logging to `sess.Run()` call here, and fixing moby/moby#45935,
as a prerequisite to debugging #812.

While looking into #812, when using a locally built provider, updates to the
Image resource would not result in a new image built and a corresponding new
`RepoDigest` output value. This was root caused to the Moby issue below after
adding a log statement following `sess.Run()` in `provider/image.go`.
- moby/moby#45935

However, the behavior seen while reproducing is *different* from our issue. That
local repro was too consistent, it occurred in over 20 consecutive builds
following the repro steps at the bottom of this commit.

After updating the Moby dependencies, the image build was no longer skipped and
I was unable to reproduce #812. Neither [Docker v23 release
notes](https://docs.docker.com/engine/release-notes/23.0/) nor the [git diff
from 23.0.1 to 23.0.7](moby/moby@v23.0.1...v23.0.7)
contain any obvious changes that explain that.

The most likely conclusion here is:
1. There is still an unknown bug, likely a race condition, that can trigger
   #812, and I was unsuccessful in reproducing it.
2. The Moby dependency update fixed an issue with the session dialing that has
   the same symptoms as #812, which is fixed in the change to `client/client.go`
   in the Moby diff above.

Repro
=====

The Pulumi program in `examples/multi-stage-build-go` was run repeatedly to
trigger #812 via:

```console
$ uuidgen -r > ./app/example-lockfile.json && pulumi up --skip-preview --diff
```

When the symptoms occurred, verbose logging indicated the image build was
skipped entirely - `docker.ImageBuild` did not run.

provider/go.mod #	modified:   provider/go.sum #	modified:   provider/image.go #
examples/multi-stage-build-go/go.sum #	modified:
examples/multi-stage-build-go/main.go #	modified:
provider/pkg/docs-gen/examples/image.md #
examples/aws-container-registry/ts/pulumi-up.log #
examples/multi-stage-build-yaml/ #	go.work #	go.work.sum #
@@ -253,7 +254,10 @@ func (p *dockerNativeProvider) dockerBuild(ctx context.Context,

//Start a session for BuildKit
if build.BuilderVersion == defaultBuilder {
sess, _ := session.NewSession(ctx, "pulumi-docker", "")
sess, err := session.NewSession(ctx, "pulumi-docker", identity.NewID())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does passing identity.NewID() do? Sorry if didn't parse it from your PR description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most complete permissively licensed client I've found is the Tilt tool, here:

https://github.com/tilt-dev/tilt/blob/a0891ef56f2d501969a27e1ecc781e2c8961cc0a/cmd/buildkitapi/main.go#L53

While we aren't using all of the same Buildkit options, I couldn't find any documentation on what the third parameter, SharedKey, does.

But it seems to me that we likely want to avoid any inadvertent sharing between concurrent Docker builds, so I followed the Tilt API's here.

I used this GitHub Code Search to try to find areas where we might deviate from norms: https://github.com/search?q=NewSession+DialHijack+lang%3Ago+-org%3Ahashicorp+-is%3Afork&type=code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't entirely follow what the NewID() does for us.

The source code says that it generates a random identifier for cases where such are required. Is that so here, for our session?

More specifically - do we need this change for skipped builds to no longer happen?

Either way, it seems relatively low risk to add this.

@@ -253,7 +254,10 @@ func (p *dockerNativeProvider) dockerBuild(ctx context.Context,

//Start a session for BuildKit
if build.BuilderVersion == defaultBuilder {
sess, _ := session.NewSession(ctx, "pulumi-docker", "")
sess, err := session.NewSession(ctx, "pulumi-docker", identity.NewID())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to check #450 to understand whether the error was ignored on purpose, but I couldn't. @guineveresaenger do you remember?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not seem like we ignored this error on purpose. Explicitly returning this error looks correct to me. 👍

Copy link
Contributor

@guineveresaenger guineveresaenger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way that we can tweak the multi-stage-build test to fail if the image build is being skipped?

Other than that, this change looks good and low risk.

@@ -253,7 +254,10 @@ func (p *dockerNativeProvider) dockerBuild(ctx context.Context,

//Start a session for BuildKit
if build.BuilderVersion == defaultBuilder {
sess, _ := session.NewSession(ctx, "pulumi-docker", "")
sess, err := session.NewSession(ctx, "pulumi-docker", identity.NewID())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't entirely follow what the NewID() does for us.

The source code says that it generates a random identifier for cases where such are required. Is that so here, for our session?

More specifically - do we need this change for skipped builds to no longer happen?

Either way, it seems relatively low risk to add this.

@AaronFriel AaronFriel merged commit 7284bef into master Nov 13, 2023
15 checks passed
@AaronFriel AaronFriel deleted the friel/dep-update branch November 13, 2023 18:29
guineveresaenger added a commit that referenced this pull request Nov 13, 2023
Updates the docs with a note that says local repoDigest is now
supported.

See
[comment](#739 (comment))

I'd like to get this in because due to #828 being merged we are ready to
cut a patch release, which will ensure that the update appears in the
registry.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants