Better support for remote caching #33

geekflyer · 2018-10-30T00:05:19Z

Related slack discussion: https://pulumi-community.slack.com/archives/C84L4E3N1/p1540854501162300

To speed up builds, both locally and in CI, it is advisable to use images that were previously pushed to a remote registry as a cache source.
Several blog posts describe the general technique, e.g. https://medium.com/@gajus/making-docker-in-docker-builds-x2-faster-using-docker-cache-from-option-c01febd8ef84 .
Doing so usually speeds up builds drastically, and it also speeds up deployment to target machines (e.g. a Kubernetes cluster), because the --cache-from technique produces images that share more layers, and those layers are more likely to already be present in a cluster's local docker cache at deployment time.

A very simple strategy is to always tag images with latest and use that tag as the --cache-from source. Prior to #31 this strategy was possible with pulumi-docker; since #31 it no longer is.
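For illustration, the "latest as cache source" flow can be sketched as a sequence of plain docker CLI calls. This is a minimal sketch, not what pulumi-docker does internally; the helper name `latest_cache_commands` and the registry name in the usage note are hypothetical.

```python
from typing import List


def latest_cache_commands(image: str, context: str = ".") -> List[List[str]]:
    """Return the docker CLI calls for a build that uses :latest as its cache.

    `image` is the repository name without a tag (hypothetical example value:
    "registry.example.com/app"). The commands are only assembled here, not run.
    """
    latest = f"{image}:latest"
    return [
        # Pull the previous image so its layers exist locally. In a real run a
        # failure here (e.g. the very first build) should be tolerated.
        ["docker", "pull", latest],
        # Reuse layers from the pulled image wherever the build steps match.
        ["docker", "build", "--cache-from", latest, "--tag", latest, context],
        # Publish the result so the next build can use it as its cache source.
        ["docker", "push", latest],
    ]
```

Calling `latest_cache_commands("registry.example.com/app")` returns three argument lists that could be passed to `subprocess.run` one after another.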

So this issue is about getting back the ability to use this or some alternative good remote caching strategy.

Here's a suggestion for a more advanced strategy that should speed up builds even more than using the latest tag as cache source:

Tag and push each image with the git SHA of HEAD.
Before each build, pulumi should attempt to pull <image_name>:<git_sha_HEAD>. If the pull succeeds, use that image in --cache-from. If it fails, attempt to pull <image_name>:<git_sha_HEAD~1> and use that as --cache-from (i.e. try to get an image that was built from an ancestor commit). Repeat this process until an image is pulled successfully (perhaps stopping after 5 iterations, or at HEAD~5).
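The ancestor lookup described above could look roughly like the following. This is a sketch under stated assumptions: the function names are hypothetical, `max_depth=5` is read as "try HEAD through HEAD~5", and the resolver and puller are injectable so the logic can be exercised without git or docker.

```python
import subprocess
from typing import Callable, Optional


def git_sha(ref: str) -> Optional[str]:
    """Resolve a git ref to a commit SHA, or None if it cannot be resolved."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--verify", "--quiet", f"{ref}^{{commit}}"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:  # git is not installed
        return None
    sha = result.stdout.strip()
    return sha if result.returncode == 0 and sha else None


def docker_pull(image_ref: str) -> bool:
    """Return True if `docker pull` succeeds for the given image reference."""
    try:
        result = subprocess.run(
            ["docker", "pull", image_ref], capture_output=True, check=False
        )
    except FileNotFoundError:  # docker CLI is not installed
        return False
    return result.returncode == 0


def find_cache_image(
    image: str,
    max_depth: int = 5,
    resolve: Callable[[str], Optional[str]] = git_sha,
    pull: Callable[[str], bool] = docker_pull,
) -> Optional[str]:
    """Walk HEAD, HEAD~1, ... HEAD~max_depth and return the first pullable tag."""
    for n in range(max_depth + 1):
        ref = "HEAD" if n == 0 else f"HEAD~{n}"
        sha = resolve(ref)
        if sha is None:
            # Not a git repo, or history is exhausted: build without a cache.
            return None
        candidate = f"{image}:{sha}"
        if pull(candidate):
            return candidate
    return None
```

A `None` result simply means "build without --cache-from", so a missing git repo or an empty registry does not fail the build.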

TODO:

This probably needs some additional thought to determine which tag to push if the git working directory was dirty at build time.
In any case this strategy only works with git repos (so pulumi should not fail if it's not a git repo, or the strategy should be made opt-in).

In either case pulumi-docker probably needs the ability to add and push an image with multiple tags in order to support #31 and remote caching at the same time.


geekflyer · 2018-10-30T00:07:18Z

This issue is related to #32 and #14

CyrusNajmabadi · 2018-11-15T20:19:08Z

We're going to take on the docker work in m20.

geekflyer · 2018-11-20T04:42:07Z

Heyho, small update from my side: I recently tried out https://github.com/GoogleContainerTools/kaniko, which has explicit support for intelligent remote caching (without the manual tag management that docker build requires): https://github.com/GoogleContainerTools/kaniko#caching-layers. I was able to run it locally just fine, and the caching seems to work pretty well with a remote cache (it simply uses a docker registry as the cache source). It is also designed to run smoothly in a k8s cluster.
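As a rough idea of what a wrapper would have to pass to kaniko, here is a sketch that assembles an executor command line with registry-backed caching enabled. The flag names are the ones I recall from the kaniko README, so verify them against the version you use; the helper name and all example values are hypothetical.

```python
from typing import List


def kaniko_args(
    context: str, dockerfile: str, destination: str, cache_repo: str
) -> List[str]:
    """Assemble a kaniko executor invocation that uses a registry as layer cache.

    All four parameters are caller-supplied; nothing here talks to a registry.
    """
    return [
        "/kaniko/executor",              # entrypoint path inside the kaniko image
        f"--context={context}",          # build context (directory, bucket, ...)
        f"--dockerfile={dockerfile}",    # Dockerfile to build
        f"--destination={destination}",  # image reference to push the result to
        "--cache=true",                  # enable layer caching
        f"--cache-repo={cache_repo}",    # registry repository that stores cached layers
    ]
```

No tags have to be managed by hand here: the cache repository is just a second repository in the registry.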

In summary, I think that instead of the complicated cache optimizations I suggested above (when using docker build), it may be easier and better to simply build a pulumi wrapper around kaniko. For reference: Skaffold also supports kaniko as an alternative builder (and I guess a few other tools do too).

Last but not least, a couple of days ago https://github.com/uber/makisu came out, which seems somewhat similar (but less mature); it also has built-in support for intelligent remote caching.

In summary, if image.Docker would just use (or support) either kaniko or makisu, I would probably prefer them anyway over a complicated docker.build wrapper.

geekflyer · 2018-12-09T15:12:25Z

And one more observation from me: I have now tried out both kaniko and makisu, and I must say that makisu seems to be much faster in most cases. Its snapshotting mechanism etc. seems to be simply faster. So I would prefer to integrate pulumi with makisu.

Place1 · 2019-09-10T12:29:35Z

@hausdorff is adding kaniko support something you’d be keen to see as an open source contribution? If it’s something that would be merged I might give it a crack :)

NakulK48 · 2023-02-28T11:55:25Z

Are there any updates on this? If you run a docker build from GitHub Actions directly, making use of the registry cache is as simple as:

cacheFrom: type=registry, ...

But no such behaviour seems to be possible using the cacheFrom in pulumi/docker https://www.pulumi.com/registry/packages/docker/api-docs/image/#cachefrom

So although local builds seem to make use of the local docker cache and are reasonably fast, CI builds end up being very slow.

AaronFriel · 2023-03-09T01:23:23Z

@geekflyer I'm closing this tentatively as resolved with the new implementation of the Docker Image resource in v4! See our blog post for more info: https://www.pulumi.com/blog/build-images-50x-faster-docker-v4/

We do know that multi-stage builds can be an issue:

Support multi-stage inline layer caching in BuildKit operation #520

Please create a new issue if you find that the cachefrom option is not working. Setting the BUILDKIT_INLINE_CACHE argument should not be necessary in recent versions of the Docker engine, but it doesn't hurt to add it.
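For anyone comparing against a plain docker build, the sketch below assembles an invocation that sets the BUILDKIT_INLINE_CACHE build argument and passes one or more --cache-from sources. It only illustrates the CLI side, not the v4 Image resource itself; the helper name and example image names are hypothetical.

```python
from typing import List


def inline_cache_build_args(
    image: str, cache_from: List[str], context: str = "."
) -> List[str]:
    """Assemble a `docker build` call that embeds cache metadata in the image."""
    # BUILDKIT_INLINE_CACHE=1 asks BuildKit to write cache metadata into the
    # image, so that a later build can use the pushed image as a cache source.
    args = ["docker", "build", "--build-arg", "BUILDKIT_INLINE_CACHE=1"]
    for source in cache_from:
        # Each previously pushed image is offered as a possible layer cache.
        args += ["--cache-from", source]
    args += ["--tag", image, context]
    return args
```

Older Docker engines may also need the DOCKER_BUILDKIT=1 environment variable for this to take effect.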

joeduffy assigned hausdorff Oct 30, 2018

joeduffy added this to the 0.19 milestone Oct 30, 2018

lukehoban assigned CyrusNajmabadi and unassigned hausdorff Nov 7, 2018

CyrusNajmabadi modified the milestones: 0.19, 0.20 Nov 15, 2018

lukehoban added the customer/feedback Feedback from customers label Nov 16, 2018

lukehoban modified the milestones: 0.20, 0.21 Jan 10, 2019

lukehoban assigned hausdorff and unassigned CyrusNajmabadi Jan 10, 2019

hausdorff modified the milestones: 0.21, 0.22 Feb 15, 2019

lukehoban modified the milestones: 0.22, 0.23 Mar 25, 2019

hausdorff removed this from the 0.23 milestone May 6, 2019

lukehoban assigned stack72 and unassigned hausdorff Nov 27, 2019

mikhailshilkov unassigned stack72 Jan 27, 2023

AaronFriel added the 4.x.x label Feb 13, 2023

AaronFriel added kind/enhancement Improvements or new features resolution/fixed This issue was fixed labels Mar 9, 2023

AaronFriel closed this as completed Mar 9, 2023

mikhailshilkov added this to the 0.85 milestone Mar 9, 2023

susanev assigned guineveresaenger Mar 11, 2023
