
proposal: archive/tar: support zero-copy reading/writing #70807

Open
hanwen opened this issue Dec 12, 2024 · 9 comments

hanwen (Contributor) commented Dec 12, 2024

Proposal Details

The container ecosystem (Podman, Docker) spends much of its time creating and consuming huge .tar files. There is potential for a significant speed-up here by having the tar package use zero-copy file transport.

The change is straightforward, but it involves an API change, so I am opening a proposal.

With the following change, tarring up a 2 GB file from tmpfs to tmpfs goes from 2.0 s to 1.3 s:

diff -u /home/hanwen/vc/go/src/archive/tar/writer.go hacktar/writer.go
--- /home/hanwen/vc/go/src/archive/tar/writer.go	2024-08-22 14:56:29.586690369 +0200
+++ hacktar/writer.go	2024-12-12 15:01:22.150045055 +0100
@@ -9,6 +9,7 @@
 	"fmt"
 	"io"
 	"io/fs"
+	"log"
 	"path"
 	"slices"
 	"strings"
@@ -491,7 +492,7 @@
 //
 // TODO(dsnet): Re-export this when adding sparse file support.
 // See https://golang.org/issue/22735
-func (tw *Writer) readFrom(r io.Reader) (int64, error) {
+func (tw *Writer) ReadFrom(r io.Reader) (int64, error) {
 	if tw.err != nil {
 		return 0, tw.err
 	}
@@ -550,6 +551,16 @@
 }
 
 func (fw *regFileWriter) ReadFrom(r io.Reader) (int64, error) {
+	log.Println("hanwen")
+	if _, ok := fw.w.(io.ReaderFrom); ok {
+		n, err := io.Copy(fw.w, r)
+		if n > fw.nb {
+			return n, fmt.Errorf("read %d bytes, beyond max %d", n, fw.nb)
+		}
+		fw.nb -= n
+		return n, err
+	}
+
 	return io.Copy(struct{ io.Writer }{fw}, r)
 }
 
@gopherbot gopherbot added this to the Proposal milestone Dec 12, 2024
hanwen (Contributor, Author) commented Dec 12, 2024

A similar optimization exists for the reading side, of course.

@hanwen hanwen changed the title proposal: archive/tar: support zero-copy writing proposal: archive/tar: support zero-copy reading/writing Dec 12, 2024
ianlancetaylor (Member) commented

Just to spell it out, I believe that the API change here is to define a new method on archive/tar.Writer:

// ReadFrom implements [io.ReaderFrom].
func (tw *Writer) ReadFrom(r io.Reader) (int64, error)

Note that I think you could get a similar effect without the API change by writing

	if tw, ok := fw.w.(*Writer); ok {
		return tw.readFrom(r)
	}

CC @dsnet

hanwen (Contributor, Author) commented Dec 13, 2024

Your suggestion certainly improves regFileWriter.ReadFrom, but nobody calls that method unless Writer.ReadFrom is exported. Am I missing something?

hanwen (Contributor, Author) commented Dec 13, 2024

For the reader, this works:

 // TODO(dsnet): Re-export this when adding sparse file support.
 // See https://golang.org/issue/22735
-func (tr *Reader) writeTo(w io.Writer) (int64, error) {
+func (tr *Reader) WriteTo(w io.Writer) (int64, error) {
 	if tr.err != nil {
 		return 0, tr.err
 	}
@@ -688,6 +688,13 @@
 }
 
 func (fr *regFileReader) WriteTo(w io.Writer) (int64, error) {
+	_, ok1 := fr.r.(io.WriterTo)
+	wrf, ok2 := w.(io.ReaderFrom)
+	if ok1 && ok2 {
+		n, err := wrf.ReadFrom(&io.LimitedReader{R: fr.r, N: fr.nb})
+		fr.nb -= n
+		return n, err
+	}
 	return io.Copy(w, struct{ io.Reader }{fr})
 }
 

ianlancetaylor (Member) commented

It's probably me who was missing something.

mvdan (Member) commented Dec 14, 2024

cc @dsnet given the TODO above

Jorropo (Member) commented Dec 14, 2024

Do we want to include logic to pad the tar so that content files are aligned to the destination's block size when the writer's destination is an *os.File?
Performance improvements would go from a single-digit factor to factors in the thousands through reflinks, at the cost of making the exact bytes of the tar depend on the output destination.

Tar natively pads to 512 bytes :'(

blockSize = 512 // Size of each block in a tar stream

hanwen (Contributor, Author) commented Dec 15, 2024

@Jorropo - Fascinating insight, thanks! I can confirm that on btrfs, if I set the blockSize to 4096, I can write a 2 GB tar file in 0.08 s, which is amazing.

Unfortunately, it appears that the block size is not variable in the tar format, so this needs to be done in a different way. Fortunately, one could simply add as many empty files as needed to pad the tar file out to 4096 bytes (or whatever the destination block size is). This can be done without changing the tar package at all.

Jorropo (Member) commented Dec 15, 2024

I am not suggesting we change that field; 512 is a hardcoded part of the tar format.
If we want to do this, we might be able to figure out a way to inject no-op entries that parsers will ignore, in order to bump the content into the correct 512-byte bucket so that it is 4 KiB (or whatever) aligned.
