remote.Write unbounded memory use #1123
I believe you're running into #413. We're essentially compressing everything twice, because we compute the digest/size on the first pass and then recompress when we're actually uploading. This is sub-optimal.

The easiest thing to do might be just saving the image tarballs differently. Do you have control of these image tarballs? How are they being produced? If the layers are compressed in the tarball, we can avoid compressing them at all! I ended up doing something similar for bazelbuild/rules_docker#1696.

A slightly harder thing to do would be refactoring the tarball package.
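For illustration, a rough sketch of what pushing already-compressed layers could look like, assuming each layer blob is a separate gzipped file on disk (the file names and the destination reference below are made up):

```go
package main

import (
	"log"

	"github.com/google/go-containerregistry/pkg/name"
	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/empty"
	"github.com/google/go-containerregistry/pkg/v1/mutate"
	"github.com/google/go-containerregistry/pkg/v1/remote"
	"github.com/google/go-containerregistry/pkg/v1/tarball"
)

func main() {
	// Hypothetical layer blobs that are already gzipped on disk.
	layerFiles := []string{"layer0.tar.gz", "layer1.tar.gz"}

	var layers []v1.Layer
	for _, f := range layerFiles {
		// LayerFromFile sniffs the file; since it is already gzipped,
		// Compressed() streams it as-is instead of re-running gzip.
		l, err := tarball.LayerFromFile(f)
		if err != nil {
			log.Fatal(err)
		}
		layers = append(layers, l)
	}

	img, err := mutate.AppendLayers(empty.Image, layers...)
	if err != nil {
		log.Fatal(err)
	}

	// Made-up destination reference.
	ref, err := name.ParseReference("registry.example.com/test/image:latest")
	if err != nil {
		log.Fatal(err)
	}
	if err := remote.Write(ref, img); err != nil {
		log.Fatal(err)
	}
}
```

Since the blobs are already compressed, remote.Write shouldn't need to run gzip at all for those layers.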
Interesting, thank you. So I could copy each layer from the tarball to a stream.Layer using Uncompressed() and write them to the registry instead? Is there an example to reproduce somewhere? Otherwise I will reuse the content of remote.Write.

We typically use a docker save command to create these tarballs, which does not allow us to compress the layers. The first solution looks preferable to me!
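A minimal sketch of that idea as I understood it (pushStreamed and dst are hypothetical names; note that, as the next comments point out, each layer still gets gzipped once while it streams):

```go
package push

import (
	"github.com/google/go-containerregistry/pkg/name"
	v1 "github.com/google/go-containerregistry/pkg/v1"
	"github.com/google/go-containerregistry/pkg/v1/empty"
	"github.com/google/go-containerregistry/pkg/v1/mutate"
	"github.com/google/go-containerregistry/pkg/v1/remote"
	"github.com/google/go-containerregistry/pkg/v1/stream"
	"github.com/google/go-containerregistry/pkg/v1/tarball"
)

// pushStreamed re-wraps every layer of a docker-save tarball in a stream.Layer,
// so digest, diffID and size are computed while the layer is uploaded rather
// than in a separate pre-pass.
func pushStreamed(tarPath, dst string) error {
	img, err := tarball.ImageFromPath(tarPath, nil)
	if err != nil {
		return err
	}
	layers, err := img.Layers()
	if err != nil {
		return err
	}
	streamed := make([]v1.Layer, 0, len(layers))
	for _, l := range layers {
		rc, err := l.Uncompressed()
		if err != nil {
			return err
		}
		// Each layer is still gzipped exactly once, inside stream.Layer,
		// while it is being written to the registry.
		streamed = append(streamed, stream.NewLayer(rc))
	}
	out, err := mutate.AppendLayers(empty.Image, streamed...)
	if err != nil {
		return err
	}
	ref, err := name.ParseReference(dst)
	if err != nil {
		return err
	}
	return remote.Write(ref, out)
}
```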
OK, I think I got you wrong: using a stream.Layer backed by a tarball layer implementation does not change the problem. I understand that I still have to either rewrite the tarball package to replace the layer implementation, or find a way to create compressed tarballs.
Actually, since pushing stream.Layers still compresses each layer at least once, that would only partially fix the issue. Given the profile above, it seems that the gzip.Write calls have allocated around 80 flate.Writers in total (40 MB, and a zeroed flate.Writer is around 500 KB), which means 8 fresh writers per layer (the image has 10 layers), so that seems like a lot. A solution could also consist in replacing the standard compression packages with https://github.com/klauspost/compress, which seems more efficient (I did not test it). For now I will look for a CLI tool to create compressed image tarballs.
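For reference, klauspost's gzip keeps the same API as the standard library, so in code it is essentially an import swap; as far as I can tell the library doesn't expose a hook to choose the compressor, though, so using it would mean patching its gzip helper. A toy sketch of the drop-in usage (compressFile and the paths are made up):

```go
package gziputil

import (
	"io"
	"os"

	// Drop-in replacement for the standard library's compress/gzip,
	// generally faster.
	gzip "github.com/klauspost/compress/gzip"
)

// compressFile gzips src into dst using the lowest compression level.
func compressFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	defer out.Close()

	zw, err := gzip.NewWriterLevel(out, gzip.BestSpeed)
	if err != nil {
		return err
	}
	if _, err := io.Copy(zw, in); err != nil {
		zw.Close()
		return err
	}
	// Close flushes the remaining data and writes the gzip footer.
	return zw.Close()
}
```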
I noticed I had been fooled by my profile not being updated on my latest tests, and I cannot reproduce the graph above since then: the total memory only reaches 15 MB. So my previous comment might be misleading.
If you're going this far, I would look at using a different on-disk representation. While compressed layers in a docker save-style tarball aren't really an option, an OCI image layout stores the layer blobs already compressed, so I'd read via the layout package and push from that. I realize that this just shifts the problem left a little bit, but that might be preferable?
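Something along these lines on the reading side (pushFromLayout and dst are made-up names; it assumes the layout contains a single image):

```go
package layoutpush

import (
	"fmt"

	"github.com/google/go-containerregistry/pkg/name"
	"github.com/google/go-containerregistry/pkg/v1/layout"
	"github.com/google/go-containerregistry/pkg/v1/remote"
)

// pushFromLayout reads an OCI image layout directory and pushes the first
// image it contains. Blobs in a layout are stored compressed, so remote.Write
// can stream them without gzipping anything.
func pushFromLayout(layoutDir, dst string) error {
	idx, err := layout.ImageIndexFromPath(layoutDir)
	if err != nil {
		return err
	}
	manifest, err := idx.IndexManifest()
	if err != nil {
		return err
	}
	if len(manifest.Manifests) == 0 {
		return fmt.Errorf("no images in layout %s", layoutDir)
	}
	img, err := idx.Image(manifest.Manifests[0].Digest)
	if err != nil {
		return err
	}
	ref, err := name.ParseReference(dst)
	if err != nil {
		return err
	}
	return remote.Write(ref, img)
}
```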
Hello! Sorry for the late answer. I followed your proposal and designed a CLI that exports daemon images to layout archives, which can then be handled by our server without dealing with compression, making the whole process much faster. For anyone needing it, here is the function used to perform the export:

```go
import (
	"archive/tar"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"

	"github.com/google/go-containerregistry/pkg/name"
	"github.com/google/go-containerregistry/pkg/v1/daemon"
	"github.com/google/go-containerregistry/pkg/v1/empty"
	"github.com/google/go-containerregistry/pkg/v1/layout"
)

// Save imports imgTag from the local docker daemon, writes it as an OCI image
// layout, and archives the layout as outDir/layout.tar.
func Save(imgTag, outDir string) error {
	tag, err := name.NewTag(imgTag)
	if err != nil {
		return err
	}
	fmt.Printf("[oci-utils] Importing image %s from the docker daemon\n", imgTag)
	img, err := daemon.Image(tag)
	if err != nil {
		return err
	}
	layoutDir := os.TempDir() + "/layout-" +
		strings.ReplaceAll(tag.RepositoryStr(), "-", "") +
		"-" + strings.ReplaceAll(tag.TagStr(), "-", "")
	if dir, set := os.LookupEnv("LAYOUT_TMP_DIR"); set {
		layoutDir = dir
	}
	layoutDir, err = filepath.Abs(layoutDir)
	if err != nil {
		return err
	}
	// Write an empty OCI image layout, then append the daemon image to it;
	// the layer blobs end up under blobs/sha256 already gzipped.
	path, err := layout.Write(layoutDir, empty.Index)
	if err != nil {
		return err
	}
	defer os.RemoveAll(layoutDir)
	err = path.AppendImage(img)
	if err != nil {
		return err
	}
	tarballPath := filepath.Join(outDir, "layout.tar")
	fmt.Println("[oci-utils] Image will be saved to " + tarballPath)
	target, err := os.Create(tarballPath)
	if err != nil {
		return err
	}
	defer target.Close()
	sink := tar.NewWriter(target)
	// Close the tar writer before the file so the archive footer gets written.
	defer sink.Close()
	// Walk the layout directory and copy every file into the tar archive.
	err = filepath.Walk(layoutDir,
		func(path string, info os.FileInfo, err error) error {
			if err != nil {
				return err
			}
			header, err := tar.FileInfoHeader(info, info.Name())
			if err != nil {
				return err
			}
			header.Name = strings.TrimPrefix(path, layoutDir)
			if err := sink.WriteHeader(header); err != nil {
				return err
			}
			if info.IsDir() {
				return nil
			}
			file, err := os.Open(path)
			if err != nil {
				return err
			}
			defer file.Close()
			_, err = io.Copy(sink, file)
			return err
		})
	if err != nil {
		return err
	}
	fmt.Printf("[oci-utils] Saved %s to %s\n", imgTag, tarballPath)
	return nil
}
```
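And a hypothetical caller, for completeness (the tag and output directory are made up; it assumes the function lives in the same package):

```go
package main

import "log"

func main() {
	// Export my-app:1.2.3 from the daemon into /tmp/export/layout.tar.
	if err := Save("my-app:1.2.3", "/tmp/export"); err != nil {
		log.Fatal(err)
	}
	// The server side can then untar layout.tar and load it with
	// layout.ImageIndexFromPath before pushing, without re-compressing layers.
}
```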
Hello,
First of all, thanks to the contributors for this well-designed library.
We are designing a containerized server which imports image tarballs and pushes them to a registry. To make it reasonably scalable, we'd need to prevent these uploads from exceeding 20-30 MB of memory each. I noticed, though, that the remote.Write process consumes quite a lot of memory and, more problematically, this amount of memory is not bounded and tends to increase with the size of the image, even when setting the remote.WithJobs(1) option.
Here is a little program I used to test this:
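A minimal sketch of such a program (the destination reference here is made up) would look roughly like this:

```go
package main

import (
	"log"

	"github.com/google/go-containerregistry/pkg/name"
	"github.com/google/go-containerregistry/pkg/v1/remote"
	"github.com/google/go-containerregistry/pkg/v1/tarball"
)

func main() {
	// Load the docker-save tarball from disk.
	img, err := tarball.ImageFromPath("test.tar", nil)
	if err != nil {
		log.Fatal(err)
	}
	// Made-up destination reference.
	ref, err := name.ParseReference("localhost:5000/test/image:latest")
	if err != nil {
		log.Fatal(err)
	}
	// Push with a single upload goroutine to bound concurrency.
	if err := remote.Write(ref, img, remote.WithJobs(1)); err != nil {
		log.Fatal(err)
	}
}
```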
Running this with a test.tar file of ~300 MB consumes about 100 MB of memory, as visible on the following memory profile (I truncated the top of the graph):
Seeing this graph, it seems the issue has to do with the standard compress/flate package, right? I've seen this issue that might be related: golang/go#32371
FYI, I experimented with a docker push using the official client within a dind container to compare (the process is different, since it pushes a daemon image and not a tarball, but apart from that it should be comparable, I guess) and observed a ~25 MB memory increase while pushing a 1 GB image. [EDIT: the graph above shows alloc_space, i.e. total memory allocated, not the instantaneous memory peak, so that is quite different.]
Is there any approach or setting that would avoid dealing with compression, or reduce the compression level if that might help? Or do we need to rewrite the library using another compression algorithm?
Thanks for the help!