
Pulumi thinks it needs to replace ECS service and task definition even though no changes #23

Closed
kennyjwilli opened this issue Sep 27, 2018 · 22 comments
Assignees
Labels
4.x.x customer/feedback Feedback from customers kind/enhancement Improvements or new features resolution/fixed This issue was fixed
Milestone

Comments

@kennyjwilli

Every time I switch computers and run pulumi update, Pulumi thinks it needs to replace my ECS task definition and update the service even though nothing has changed. The only thing that is different in the containerDefinition is the sha256 for IMAGE_DIGEST in the environment list.

From @lukehoban on Slack:

> Yes - we currently inject the docker image ID as the IMAGE_DIGEST, and it may be that docker build will produce different IDs for the same sources on different machines (perhaps even if there are different docker versions?). We are looking into making some changes to how we lock the digest version - and this is a factor we'll want to take into consideration.

@lukehoban lukehoban self-assigned this Oct 1, 2018
@lukehoban lukehoban added this to the 0.18 milestone Oct 1, 2018
@lukehoban lukehoban assigned hausdorff and unassigned lukehoban Oct 4, 2018
@CyrusNajmabadi
Contributor

@kennyjwilli Do you have a repro for this? I'm not seeing this. Is this a multi-machine scenario?
@lukehoban can you fill me in on your thoughts about:

> We are looking into making some changes to how we lock the digest version - and this is a factor we'll want to take into consideration.

@kennyjwilli
Author

Not sure what exactly you mean by multi-machine scenario. This specific ticket is about running pulumi update on two different computers and Pulumi thinking it needs to replace the ECS task.

@CyrusNajmabadi
Contributor

> Not sure what exactly you mean by multi-machine scenario. This specific ticket is about running pulumi update on two different computers and Pulumi thinking it needs to replace the ECS task.

Yup! That's just what i wanted to verify for certain. I'll wait to hear back from Luke about his thoughts on how we can be resilient to this.

@joeduffy
Member

@CyrusNajmabadi I guess I would love to know first whether Docker builds on different machines necessarily imply different SHA hashes. I would have assumed "no": identical builds across machines, provided the contents are identical, would end up with identical hashes.

If that's true, it could be that in this specific case, there's some machine-specific info making its way into the container image somehow.

@hausdorff

IIRC the situation is thus:

  • It is nice to decouple image tag and unique containerID (which is a SHA). This allows users to write image("nginx:alpine") and then during preview see whether the SHA underneath that tag has changed. This gives you a very high degree of reproducible deployments, since you know precisely what container you're deploying. (This comes from the work we did on ksonnet.)
  • The problem is, the container ID seems to be granted by the registry. So in ksonnet we'd just ask the registry "what's the ID/SHA for this image tag"? And what we'd get back is the SHA we'd use to precisely identify the version of the container we want to run.
  • Here we're taking the SHA from the Docker daemon, which seems to differ per machine.
  • Worse, if we wait until we push to resolve the container ID, we will conservatively report everything needs to be re-done in preview, since we can't know the container ID ahead of time.
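The ksonnet-style approach described above can be sketched as follows. This is an illustration, not Pulumi's actual code: `pinToDigest` assumes the digest has already been fetched from the registry (a real implementation would call the registry API), and the naive `split(":")` ignores registry hosts with ports.

```typescript
// Sketch of the ksonnet-style flow: resolve a mutable tag to the immutable
// registry digest, then deploy by that digest reference. The digest value
// here stands in for what a registry API call would return.
function pinToDigest(imageRef: string, digest: string): string {
  const name = imageRef.split(":")[0]; // drop the mutable tag (naive: breaks on host:port)
  return `${name}@${digest}`;
}

const pinned = pinToDigest("nginx:alpine", "sha256:0123abcd");
console.log(pinned); // → nginx@sha256:0123abcd
```

Deploying by `name@sha256:...` instead of `name:tag` is what makes preview able to report precisely whether the image behind a tag has changed.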

@ericrudder
Member

ericrudder commented Oct 15, 2018 via email

@joeduffy
Member

> Here we're taking the SHA from the Docker daemon, which seems to differ per machine.

Why would it differ per machine?

If it's a timestamp, for example, then it's actually a different image...

@hausdorff

I don't think we know why it differs per machine. Timestamp seems likely, but at least I have not dug in. I do think from this information we can conclude that we have to use a different mechanism to identify whether a container is unique, especially since we are not aware of any published guarantees about what this SHA is generated from.

@joeduffy
Member

I'd love to know what it's actually produced from before assuming we can't depend on it.

@CyrusNajmabadi
Contributor

@kennyjwilli I wasn't able to actually repro this myself. I'm wondering if this is something particular to your stack. Would it be possible for you to share the docker-file+build-folder with me (and also the code you use to create the service)? I'd like to see if it's something in particular about that docker setup. Thanks!

@CyrusNajmabadi
Contributor

@kennyjwilli Also, what version of docker are you using on these boxes? Thanks!

@CyrusNajmabadi
Contributor

I've been able to repro this, and have traced it down to a docker design decision documented here: https://github.com/moby/moby/blob/master/image/spec/v1.md

Specifically, when docker produces the images for layers, it embeds a "created" timestamp in the metadata for that image. When producing the final image, this information is contained in 'json' files that are eventually tarred up. That final tar is hashed, and will therefore be different on every machine you build on.

I'm looking around to see if there's any way to avoid this - some way, perhaps, to force a specific date for docker to use here. Absent that, this may just be how docker works, and it may be the case that with or without pulumi you would be experiencing this no matter what.
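The mechanism can be demonstrated in miniature. This is an illustration, not docker's actual code: a toy config object mirroring the spec's `created` field shows how hashing the same layer contents at two different build times yields two different digests.

```typescript
import { createHash } from "crypto";

// Toy image config, mirroring the "created" field from the image spec
// linked above. Only "created" varies between the two machines.
function imageConfigDigest(layerDiffId: string, created: string): string {
  const config = JSON.stringify({
    created, // build-time dependent, so it poisons the hash
    rootfs: { type: "layers", diff_ids: [layerDiffId] },
  });
  return createHash("sha256").update(config).digest("hex");
}

// Identical layer contents, built at different times on two machines:
const layer = "sha256:da39a3ee5e6b4b0d3255bfef95601890afd80709";
const digestA = imageConfigDigest(layer, "2018-10-18T09:00:00Z");
const digestB = imageConfigDigest(layer, "2018-10-19T14:30:00Z");
console.log(digestA === digestB); // false, despite identical sources
```

This is exactly why the same `pulumi update` on two machines sees "different" images.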

@CyrusNajmabadi
Contributor

Ok. I spelunked through the docker code, and I couldn't find any way to avoid this. Furthermore, I'm virtually certain that if you were doing this manually (i.e. without pulumi), you'd be running into this.

One thing you can do to try to help avoid this is to clear your docker cache on one of your machines, then export your docker image from one and import it on the other. You can use docker save and docker load to do this.

If you end up doing this, I believe both machines will then have the same images for docker to reuse, without docker wanting to create new images with fresh timestamps embedded.

@CyrusNajmabadi
Contributor

Closing out. Note: @kennyjwilli asked if there was any way for pulumi to pull down the built images that had been published. I believe that the 'cacheFrom' property here is intended to help with that: https://github.com/pulumi/pulumi-cloud/blob/d8315b6ff7de0e76ad8aa7c4195335493b199988/api/service.ts#L161

However, I'm personally unfamiliar with how it works. @pgavlin (who is on vacation right now) may be able to help guide you through using this. For now, do you want to try passing 'true' for that value to see if it helps?
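For reference, a hedged sketch of what passing that flag might look like in a pulumi-cloud program. The service name, container name, build folder, and port here are made up for illustration, and the exact shape of the build object is an assumption based on the api/service.ts link above rather than a verified API:

```typescript
import * as cloud from "@pulumi/cloud";

// Hypothetical usage sketch: "my-service", "nginx", and "./app" are
// placeholders; cacheFrom is the property referenced above, which should
// pull the previously pushed image to seed the local build cache.
const service = new cloud.Service("my-service", {
  containers: {
    nginx: {
      build: {
        context: "./app",
        cacheFrom: true,
      },
      ports: [{ port: 80 }],
    },
  },
});
```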

@joeduffy
Member

This doesn't seem like a satisfactory outcome. I can imagine this is going to be a common issue for anybody trying to do CD of Dockerized services with Pulumi.

@CyrusNajmabadi @lukehoban @hausdorff Thoughts on what we can do here?

@joeduffy joeduffy reopened this Oct 25, 2018
@joeduffy joeduffy modified the milestones: 0.18, 0.19 Oct 25, 2018
@lukehoban
Contributor

From our last discussion on this, the path forward on several related issues like this was to do two things:

  1. Use an Archive to track changes to the sources of the Docker build folder (and the Dockerfile, if it is outside that folder) within the Pulumi resource model.
  2. Move docker.Image over to being a CustomResource that can fully participate in the Pulumi resource dependency graph so that it can know to only re-build and re-push when there are changes in the Archive hash.

For (2), this could be accomplished either with a dynamic provider, or via moving this whole package over to be a true Pulumi Provider written in Go.

Short of doing that, we could not think of any robust way to use docker itself to reliably handle these issues.

Relying on Archive hash semantics instead of Docker build caching is a little worrying, just because it's a different semantic model. But it should be a conservative additional layer of "caching", and relying on docker build caching already provides limited guarantees on if/when layers will get re-built, even if the rebuild may cause different contents to be created (timestamps in builds, different bits from npm install, etc.).

So we have what we think is a path to addressing this class of issues. But it will be a pretty significant overhaul of this library. And the right thing if we go this direction is probably to move to a real provider - which would be a complete re-write.
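The Archive-hash semantics described above can be sketched roughly as follows. This is an assumed shape, not Pulumi's implementation: hash only relative paths and file bytes, walked in sorted order, so timestamps and other machine-local metadata can never change the result the way docker's image IDs do.

```typescript
import { createHash } from "crypto";
import * as fs from "fs";
import * as path from "path";

// Content-only hash of a build context: deterministic across machines
// because it folds in relative paths and file bytes, never mtimes.
function contextHash(root: string): string {
  const hash = createHash("sha256");
  const walk = (dir: string): void => {
    for (const name of fs.readdirSync(dir).sort()) {
      const full = path.join(dir, name);
      if (fs.statSync(full).isDirectory()) {
        walk(full);
      } else {
        hash.update(path.relative(root, full));
        hash.update(fs.readFileSync(full));
      }
    }
  };
  walk(root);
  return hash.digest("hex");
}
```

Two machines with identical build-context contents then compute identical hashes regardless of file timestamps, so preview could correctly report "no changes" and skip the re-build and re-push.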

@CyrusNajmabadi
Contributor

> This doesn't seem like a satisfactory outcome. I can imagine this is going to be a common issue for anybody trying to do CD of Dockerized services with Pulumi.

@joeduffy This appears to be an issue for anyone doing CD of dockerized services, regardless of whether they're using pulumi or not.

As Luke mentions, we discussed an alternative approach here. But @hausdorff was tasked with taking that on, as he has the most context on this space and on doing a revamp to a dynamic-provider-based approach. I'm going to assign this over to him, unless there's already another bug tracking this work (@hausdorff, you mentioned you were going to create one to track the results of our convo?)

@lukehoban lukehoban added customer/feedback Feedback from customers priority/P1 labels Nov 16, 2018
@lukehoban lukehoban modified the milestones: 0.20, 0.21 Jan 10, 2019
@lukehoban
Contributor

This will require a more or less complete overhaul of this library per #23 (comment) - and we haven't started work on it - so it won't get done in M21. This remains a very high priority issue that we will need to find time to prioritize.

@lukehoban lukehoban modified the milestones: 0.21, 0.22 Mar 9, 2019
@lukehoban lukehoban modified the milestones: 0.22, 0.23 Apr 1, 2019
@hausdorff hausdorff removed this from the 0.23 milestone May 6, 2019
@ameier38

@lukehoban I think an interesting benefit, if possible, of using an Archive for the Docker context would be the ability to include files from different directories. For my use case I have a directory called protos in which I store protobuf definitions and then generate the code stubs using Uber's prototool. To build my services I first have to copy these stubs into the service directory in order to build the Docker image. It would be a nice feature to be able to track when the generated stubs have changed and automatically update the build context, keeping the service and stubs in sync. We can't currently mount a volume during the build, which would also solve this for me.

Also, having the Docker image show a change when running pulumi preview would be really nice. Sometimes I am just building an image in a stack and exporting the image name, and I can't see the change until I run a full update and view the outputs. Edit: this works in the latest version 👍

@Blitz2145

> I've been able to repro this, and have traced it down to a docker design decision documented here: https://github.com/moby/moby/blob/master/image/spec/v1.md
>
> Specifically, when docker produces the images for layers, it embeds a "created" timestamp in the metadata for that image. When producing the final image, this information is contained in 'json' files that are eventually tarred up. That final tar is hashed, and will therefore be different on every machine you build on.
>
> I'm looking around to see if there's any way to avoid this - some way, perhaps, to force a specific date for docker to use here. Absent that, this may just be how docker works, and it may be the case that with or without pulumi you would be experiencing this no matter what.

@CyrusNajmabadi Maybe buildkit, the new docker image builder, will take a PR to add reproducible timestamps (some discussion here: moby/buildkit#1058), which might be a route to tackling spurious image builds rather than going the archive route.

@AaronFriel
Contributor

In the new implementation of the Docker Image resource in v4, a new image is not built unless the provider detects a change in the build context. See our blog post for more info: https://www.pulumi.com/blog/build-images-50x-faster-docker-v4/
