Better support for remote caching #33

Closed
geekflyer opened this issue Oct 30, 2018 · 7 comments
Labels: 4.x.x, customer/feedback, kind/enhancement, resolution/fixed

Comments

@geekflyer

Related slack discussion: https://pulumi-community.slack.com/archives/C84L4E3N1/p1540854501162300

To speed up builds, both locally and in CI, it is advisable to use images that were previously pushed to a remote registry as a cache source.
There are a bunch of blog posts which describe the general technique, e.g. https://medium.com/@gajus/making-docker-in-docker-builds-x2-faster-using-docker-cache-from-option-c01febd8ef84 .
Doing so usually not only speeds up builds drastically, it also speeds up deployment to target machines (e.g. a Kubernetes cluster), because images built with the --cache-from technique share more layers, and those layers are more likely to already be present in a cluster's local Docker cache at deployment time.

A very simple strategy is to always tag images with latest and use this tag as the --cache-from source. Prior to #31 it was possible to use this strategy with pulumi-docker. Since #31 this strategy is no longer possible.
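For illustration, the "latest as cache source" strategy could be sketched as the three docker CLI invocations below. This is only a sketch and not what pulumi-docker does internally: the helper name `latest_cache_commands` and the registry name in the usage note are made up for the example, and the `BUILDKIT_INLINE_CACHE=1` build arg is included on the assumption that BuildKit is the builder (it embeds cache metadata in the pushed image).

```python
from typing import List


def latest_cache_commands(image: str, context: str = ".") -> List[List[str]]:
    """Return the docker CLI invocations for the 'latest as cache' strategy.

    `image` is the repository name without a tag (hypothetical example value:
    "registry.example.com/app").
    """
    latest = f"{image}:latest"
    return [
        # Pull the previous build. On the very first build this fails,
        # and the caller should ignore that error.
        ["docker", "pull", latest],
        # Reuse layers from the pulled image. The build arg asks BuildKit to
        # embed cache metadata so later builds can use this image as a cache.
        [
            "docker", "build",
            "--cache-from", latest,
            "--build-arg", "BUILDKIT_INLINE_CACHE=1",
            "--tag", latest,
            context,
        ],
        # Publish the result so the next build can pull it.
        ["docker", "push", latest],
    ]
```

A caller would run each command in order, e.g. with `subprocess.run(cmd)`, and treat a failing pull as "no cache available yet".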

So this issue is about getting back the ability to use this or some alternative good remote caching strategy.

Here's a suggestion for a more advanced strategy that should speed up builds even more than using the latest tag as cache source:

  1. tag and push each image with the git_sha of HEAD
  2. before each build pulumi should attempt to pull <image_name>:<git_sha_HEAD>. If it could successfully pull this, use it in --cache-from. If not successful, attempt to pull <image_name>:<git_sha_HEAD~1> and use this as --cache-from (basically attempt to get an image that was built from an ancestor commit). Repeat this process until an image could be pulled successfully (maybe stop doing so after 5 iterations or until HEAD~5).
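The two steps above could look roughly like the following. It is a sketch under assumptions, not an existing pulumi-docker feature: the function names are hypothetical, `git` and `docker` are assumed to be on PATH, and only first-parent ancestors are considered. When the directory is not a git repository, the SHA list is empty and no cache image is selected.

```python
import subprocess
from typing import Callable, List, Optional


def ancestor_shas(max_depth: int = 5, cwd: str = ".") -> List[str]:
    """Return the SHAs of HEAD, HEAD~1, ..., HEAD~max_depth (first parents).

    Returns an empty list when cwd is not a git repository or git is
    unavailable, so callers can fall back to an uncached build.
    """
    try:
        out = subprocess.run(
            ["git", "rev-list", "--first-parent",
             f"--max-count={max_depth + 1}", "HEAD"],
            cwd=cwd, capture_output=True, text=True, check=True,
        ).stdout
    except (subprocess.CalledProcessError, OSError):
        return []
    return out.split()


def docker_pull(ref: str) -> bool:
    """Try to pull ref and report whether the pull succeeded."""
    try:
        result = subprocess.run(["docker", "pull", ref], capture_output=True)
    except OSError:
        return False
    return result.returncode == 0


def find_cache_image(
    image: str,
    shas: List[str],
    pull: Callable[[str], bool] = docker_pull,
) -> Optional[str]:
    """Return the first <image>:<sha> that can be pulled, newest commit first."""
    for sha in shas:
        ref = f"{image}:{sha}"
        if pull(ref):
            return ref
    return None
```

The returned reference, if any, would be passed as `--cache-from`; a result of `None` means the build simply runs without a remote cache.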

TODO:

  • This probably needs some additional thought to determine which tag to push if the git working directory was dirty at the time of the build.
  • In any case, this strategy only works with git repos (so pulumi should not fail if it's not a git repo, or this strategy should be made opt-in).

In either case pulumi-docker probably needs the ability to add and push an image with multiple tags in order to support #31 and remote caching at the same time.

@geekflyer
Author

This issue is related to #32 and #14

@joeduffy joeduffy added this to the 0.19 milestone Oct 30, 2018
@lukehoban lukehoban assigned CyrusNajmabadi and unassigned hausdorff Nov 7, 2018
@CyrusNajmabadi CyrusNajmabadi modified the milestones: 0.19, 0.20 Nov 15, 2018
@CyrusNajmabadi
Contributor

We're going to take on the docker work in m20.

@lukehoban lukehoban added the customer/feedback Feedback from customers label Nov 16, 2018
@geekflyer
Author

heyho, small update from my side: I recently tried out https://github.com/GoogleContainerTools/kaniko which has explicit support for intelligent remote caching (without manual tag management as with docker build): https://github.com/GoogleContainerTools/kaniko#caching-layers. I was able to run it locally just fine and the caching seems to work pretty well with a remote cache (it just uses a docker registry as cache source), and obviously it is also made to run smoothly in a k8s cluster.

In summary, I think that instead of all the complicated cache optimizations I suggested above (when using docker build), it may be easier and better to simply build a pulumi wrapper around kaniko. For reference: Skaffold also supports kaniko as an alternative builder tool (and I guess a few others do too).
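To make the kaniko idea a bit more concrete, a wrapper would mostly need to assemble the executor's arguments. The sketch below only builds that argument list; the helper name `kaniko_args` is hypothetical, and how the executor container is actually launched (e.g. as a Kubernetes pod) is left out. The flags themselves are the ones documented in the kaniko README.

```python
from typing import List


def kaniko_args(
    context: str,
    destination: str,
    cache_repo: str,
    dockerfile: str = "Dockerfile",
) -> List[str]:
    """Build the argument list for the kaniko executor with layer caching on."""
    return [
        "/kaniko/executor",
        f"--context={context}",          # build context (directory, bucket, git URL)
        f"--dockerfile={dockerfile}",    # Dockerfile path within the context
        f"--destination={destination}",  # image reference to push
        "--cache=true",                  # enable kaniko's layer caching
        f"--cache-repo={cache_repo}",    # registry repository holding cached layers
    ]
```

With `--cache=true`, kaniko looks up cached layers in the given repository by itself, which is what removes the manual tag management discussed above.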

Last but not least, a couple of days ago https://github.com/uber/makisu came out, which seems somewhat similar (but less mature). It also has built-in support for intelligent remote caching.

In summary, if image.Docker would just use (or support) either kaniko or makisu, I would probably prefer to use them over a complicated docker.build wrapper anyway.

@geekflyer
Author

And one more observation from me: I have now tried out both kaniko and makisu and must say that makisu seems to be much faster in most cases. It seems like its snapshotting mechanism etc. is simply faster. So I would prefer to integrate pulumi with makisu.

@lukehoban lukehoban modified the milestones: 0.20, 0.21 Jan 10, 2019
@hausdorff hausdorff modified the milestones: 0.21, 0.22 Feb 15, 2019
@lukehoban lukehoban modified the milestones: 0.22, 0.23 Mar 25, 2019
@hausdorff hausdorff removed this from the 0.23 milestone May 6, 2019
@Place1

Place1 commented Sep 10, 2019

@hausdorff is adding kaniko support something you’d be keen to see as an open source contribution? If it’s something that would be merged I might give it a crack :)

@NakulK48

Are there any updates on this? If you run a docker build from GitHub Actions directly, making use of the registry cache is as simple as:

cacheFrom: type=registry, ...

But no such behaviour seems to be possible using the cacheFrom in pulumi/docker https://www.pulumi.com/registry/packages/docker/api-docs/image/#cachefrom

So although local builds seem to make use of the local docker cache and are reasonably fast, CI builds end up being very slow.
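For comparison, the registry-backed cache referred to here corresponds, on the plain docker CLI, to the `--cache-from` and `--cache-to` options of `docker buildx build`. The sketch below only assembles such a command line and is not a pulumi/docker API; the helper name and the reference names in the usage note are hypothetical, and exporting a registry cache may require a builder that supports it (for example the docker-container driver).

```python
from typing import List


def buildx_registry_cache_command(
    image: str,
    cache_ref: str,
    context: str = ".",
) -> List[str]:
    """Build a `docker buildx build` command that reads and writes a registry cache."""
    return [
        "docker", "buildx", "build",
        # Import cache from a dedicated cache reference in the registry.
        "--cache-from", f"type=registry,ref={cache_ref}",
        # Export cache for all stages (mode=max) back to the same reference.
        "--cache-to", f"type=registry,ref={cache_ref},mode=max",
        "--tag", image,
        "--push",
        context,
    ]
```

For example, `buildx_registry_cache_command("registry.example.com/app:abc", "registry.example.com/app:buildcache")` yields a command that a CI job could run on every build, so a fresh runner starts from the cache left by the previous one.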

@AaronFriel
Contributor

@geekflyer I'm closing this tentatively as resolved with the new implementation of the Docker Image resource in v4! See our blog post for more info: https://www.pulumi.com/blog/build-images-50x-faster-docker-v4/

We do know that multi-stage builds can be an issue:

Please create a new issue if you find that the cachefrom option is not working. Setting the BUILDKIT_INLINE_CACHE argument should not be necessary in recent versions of the Docker engine, but it doesn't hurt to add it.

@AaronFriel AaronFriel added kind/enhancement Improvements or new features resolution/fixed This issue was fixed labels Mar 9, 2023
@mikhailshilkov mikhailshilkov added this to the 0.85 milestone Mar 9, 2023