exporter: support resetting timestamp for determinism #1058
Comments
We also need to consider gzip determinism. For small images we might be able to just push the image without gzip and call it a day.
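To illustrate the gzip point, here is a minimal sketch, assuming GNU gzip and coreutils `sha256sum`. The file name `layer.tar` is a hypothetical stand-in for a real layer tarball. By default gzip records the input file's name and mtime in its header, so identical bytes can compress to different digests, while `-n` leaves both out. This does not cover differences between gzip implementations, versions, or compression levels, which can also change the output.

```shell
set -eu
work=$(mktemp -d)
printf 'same layer content\n' > "$work/layer.tar"   # hypothetical stand-in for a layer tarball

# Reduce a stream on stdin to its bare sha256 hex digest.
digest() {
  sha256sum | cut -d' ' -f1
}

touch -t 200001010000 "$work/layer.tar"
plain_a=$(gzip -c "$work/layer.tar" | digest)
stable_a=$(gzip -n -c "$work/layer.tar" | digest)

touch -t 201001010000 "$work/layer.tar"   # same bytes, different mtime
plain_b=$(gzip -c "$work/layer.tar" | digest)
stable_b=$(gzip -n -c "$work/layer.tar" | digest)

echo "default: $plain_a vs $plain_b"     # typically differ with GNU gzip, which stores the mtime
echo "gzip -n: $stable_a vs $stable_b"   # identical: name and mtime are omitted from the header
rm -rf "$work"
```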
If we only reset times in the differ it is a little unsafe, because snapshots in the local build cache and the remote build cache will have different timestamps. But if we reset in the snapshot it will 1) be slow and 2) confuse the containerd naive differ. I guess the first is fine if this is opt-in from the user and clearly marked as an exporter feature. With a custom (e.g. FUSE-based) snapshotter+differ we could do this without the above limitations as well.
Description
The Reproducible Builds guidelines point out that build tools should make the build timestamp configurable: https://reproducible-builds.org/docs/source-date-epoch/
While building docker image with
There is this cool blogpost that describes the simplest ever docker image ending up with different digests when built on two different hosts or without cache.
Workarounds
Describe the results you expected:
This would make it possible to reuse the layer cache while building docker images from various build tool plugins like this sbt/sbt-native-packager#1321 or described here
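As a rough sketch of what a configurable build timestamp could look like from the caller's side, the snippet below turns `SOURCE_DATE_EPOCH` into an RFC 3339 UTC string, the format the OCI image config uses for its "created" field. It assumes GNU `date` (for `-d @SECONDS`). The helper name `created_from_epoch` and the fallback value of 0 are hypothetical choices for this example, and how the value would actually be handed to the builder is left open.

```shell
set -eu
# Format a Unix timestamp as an RFC 3339 UTC string.
# Relies on GNU date's -d @SECONDS syntax.
created_from_epoch() {
  date -u -d "@$1" +%Y-%m-%dT%H:%M:%SZ
}

# Use SOURCE_DATE_EPOCH when the caller exports it; the fallback of 0 is an
# arbitrary choice for this sketch, not something any tool mandates.
epoch="${SOURCE_DATE_EPOCH:-0}"
echo "created: $(created_from_epoch "$epoch")"
```

A typical caller might export `SOURCE_DATE_EPOCH` from the last commit time (for example via `git log -1 --pretty=%ct`) before running the build.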
I'm fine with making the "created" field configurable (it is already stable in buildkit if you get a cache hit for a layer), but that on its own doesn't really solve this issue. Files generated in run commands still get fresh timestamps, and gzip may not be deterministic. Resetting timestamps in snapshots is quite hard in current implementations, as explained before. Some files are created by runc, which is out of our control.
Hi, based on the previous answer, has a build flag that overrides just the "created" layer property, rather than the timestamps of the layer files, been implemented already? That on its own would help the people who already make their layer storage stable (i.e. not depending on the current time). Cheers,
This is a really important feature for reproducible images.
Reproducer
cat <<EOF > Dockerfile
FROM alpine:3.5
COPY ./install /
CMD [ "/bin/sh", "hello.sh" ]
EOF
mkdir install
for i in $(seq 1 5); do
echo "echo $i" > install/hello.sh
touch -t 8001010000 install/hello.sh
docker buildx build --progress plain --tag reproducer:latest .
echo --------------------------------------------------
echo expect $i
docker run --rm reproducer:latest
echo --------------------------------------------------
done
Output
At the LLB level we do support overwriting timestamps on file operations (copy, mkdir, etc.). So for the last example we could add a flag in
I'd want the timestamp to be a "no-later-than" value. So if there are files in my base layers with earlier timestamps, they should not be modified. That would also allow the cache to be reused when a cache step with an equal or earlier timestamp is seen. And if it finds a cache entry with a later timestamp, I'd find it acceptable to exclude that entry or recreate it with the earlier timestamp.
The workflow I imagine is setting the timestamp to the git commit time of the repo I'm building, to the timestamp set in a label on an image I'm reproducing, or to an effective zero value like Jan 1, 1970. In the first two cases, if I'm being reproducible, my base image would be pinned to something that existed before that git commit was created. And in the latter case, we may have a parallel cache for layers where the timestamps have been stripped.
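The "no-later-than" behaviour can be sketched outside BuildKit with GNU tar, assuming version 1.29 or later for `--clamp-mtime`. This is only an illustration of the clamping idea, not how the exporter is or would be implemented. The directory name `rootfs`, the helper `pack_digest`, and the epoch value are hypothetical.

```shell
set -eu
epoch=1000000000   # hypothetical "no-later-than" value (2001-09-09 UTC)
work=$(mktemp -d)
mkdir "$work/rootfs"
echo old > "$work/rootfs/old.txt"
echo new > "$work/rootfs/new.txt"
touch -t 198001010000 "$work/rootfs/old.txt"   # earlier than $epoch, so tar keeps it

# Pack rootfs with normalized ordering and ownership, and print the archive digest.
# --mtime combined with --clamp-mtime only rewrites mtimes that are later than $epoch.
pack_digest() {
  tar --format=gnu --sort=name --owner=0 --group=0 --numeric-owner \
      --mtime="@$epoch" --clamp-mtime \
      -C "$work/rootfs" -cf - . | sha256sum | cut -d' ' -f1
}

touch -t 202001010000 "$work/rootfs/new.txt"
first=$(pack_digest)
touch -t 202101010000 "$work/rootfs/new.txt"   # later mtime, still clamped to $epoch
second=$(pack_digest)

echo "first:  $first"
echo "second: $second"   # same digest: both mtimes were clamped to $epoch
rm -rf "$work"
```

Because earlier mtimes pass through untouched, files inherited from an older base layer would keep their original timestamps under this scheme.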