make Docker build smarter, add Dockerfile.debian #6344

Merged on Dec 18, 2022 (5 commits)

elee1766 (Contributor) commented Dec 16, 2022

This PR makes several changes to the Dockerfile to make the image faster to build, download, and upload.

  1. Instead of copying the entire repository at once, it first copies the go.mod and go.sum files, then runs go mod download. This puts the dependencies in their own layer, so they no longer rely on the build cache mount.

  2. The compilation of the db-tools is moved to a separate build stage. Since these rarely change, not rebuilding them every time makes local development a lot faster. It also reduces how much has to be uploaded for a new release, since the db-tools layers will be unchanged.

  3. Each binary is copied individually into its own layer. This lets Docker upload and download the binaries in parallel, and it recovers better if a transfer fails partway, since the existing 500 MB layer is now split into parts.
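
Points 1 and 3 can be sketched roughly as follows. This is a minimal reconstruction from the build log further down, not the merged Dockerfile: the db-tools stage is left out, only three of the binaries are shown, `make all` as the build command is an assumption, and the real `RUN --mount` line has at least one more mount that is cut off in the log.

```dockerfile
# syntax=docker/dockerfile:1.2
FROM docker.io/library/golang:1.19-alpine3.16 AS builder
RUN apk --no-cache add build-base linux-headers git bash ca-certificates libstdc++
WORKDIR /app

# Dependency manifests first: these layers are reused until go.mod or go.sum change.
ADD go.mod go.mod
ADD go.sum go.sum
RUN go mod download

# Source edits only invalidate the layers from this point on.
ADD . .
# Assumption: build command and mount list are illustrative (the log truncates the original line).
RUN --mount=type=cache,target=/root/.cache \
    --mount=type=cache,target=/tmp/go-build \
    make all

FROM docker.io/library/alpine:3.16
RUN apk add --no-cache ca-certificates libstdc++ tzdata
RUN adduser -D -u 1000 -g 1000 erigon

# One COPY per binary: each binary becomes its own layer and can be
# pushed or pulled independently of the others.
COPY --from=builder /app/build/bin/erigon /usr/local/bin/erigon
COPY --from=builder /app/build/bin/rpcdaemon /usr/local/bin/rpcdaemon
COPY --from=builder /app/build/bin/sentry /usr/local/bin/sentry
```

The ordering is the important part: anything placed before `ADD . .` stays cached across source-only changes, and anything after it is rebuilt.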

It also adds a second Dockerfile, Dockerfile.debian, which builds erigon on a Debian image, as a start to addressing #6255.

While this Dockerfile has a greater total image size, the total size of the layers that differ between versions will be smaller, resulting in smaller effective upload and download sizes.

With all that said, I am not really sure how the existing erigon CI/release process works, so these changes may be incompatible with it.

comparison

docker build speed

In both examples, I build erigon, then change core/blockchain.go (forcing a recompilation).

These are the resulting logs:

CURRENT DOCKERFILE

[+] Building 70.1s (18/18) FINISHED
 => [internal] load build definition from Dockerfile                                                                     0.1s
 => => transferring dockerfile: 38B                                                                                      0.0s
 => [internal] load .dockerignore                                                                                        0.2s
 => => transferring context: 34B                                                                                         0.0s
 => resolve image config for docker.io/docker/dockerfile:1.2                                                             0.4s
 => CACHED docker-image://docker.io/docker/dockerfile:1.2@sha256:e2a8561e419ab1ba6b2fe6cbdf49fd92b95912df1cf7d313c3e223  0.0s
 => [internal] load metadata for docker.io/library/alpine:3.16                                                           0.4s
 => [internal] load metadata for docker.io/library/golang:1.19-alpine3.16                                                0.4s
 => [builder 1/5] FROM docker.io/library/golang:1.19-alpine3.16@sha256:4b4f7127b01b372115ed9054abc6de0a0b3fdea224561b35  0.0s
 => [stage-1 1/5] FROM docker.io/library/alpine:3.16@sha256:b95359c2505145f16c6aa384f9cc74eeff78eb36d308ca4fd902eeeb0a0  0.0s
 => [internal] load build context                                                                                        0.1s
 => => transferring context: 111.58kB                                                                                    0.0s
 => CACHED [builder 2/5] RUN apk --no-cache add build-base linux-headers git bash ca-certificates libstdc++              0.0s
 => CACHED [builder 3/5] WORKDIR /app                                                                                    0.0s
 => [builder 4/5] ADD . .                                                                                                0.5s
 => [builder 5/5] RUN --mount=type=cache,target=/root/.cache     --mount=type=cache,target=/tmp/go-build     --mount=t  61.3s
 => CACHED [stage-1 2/5] RUN apk add --no-cache ca-certificates curl libstdc++ jq tzdata                                 0.0s
 => [stage-1 3/5] COPY --from=builder /app/build/bin/* /usr/local/bin/                                                   0.2s
 => [stage-1 4/5] RUN adduser -D -u 1000 -g 1000 erigon                                                                  0.8s
 => [stage-1 5/5] RUN mkdir -p ~/.local/share/erigon                                                                     1.0s
 => exporting to image                                                                                                   2.6s
 => => exporting layers                                                                                                  2.6s
 => => writing image sha256:948c68e8d2f64df2c4fa758a370b8de8c4aab65c91c3aeca96662ec8eafb7815                             0.0s

Since the downloaded dependencies live in the cache mount, rebuild time does not suffer, but note that they do not get their own layer.

More importantly, since the db-tools are rebuilt every time, an extra 10-20s is added to the Docker build time.

NEW DOCKERFILE:

[+] Building 52.6s (50/50) FINISHED
 => [internal] load build definition from Dockerfile                                                  0.3s
 => => transferring dockerfile: 38B                                                                   0.0s
 => [internal] load .dockerignore                                                                     0.2s
 => => transferring context: 34B                                                                      0.0s
 => resolve image config for docker.io/docker/dockerfile:1.2                                          0.4s
 => CACHED docker-image://docker.io/docker/dockerfile:1.2@sha256:e2a8561e419ab1ba6b2fe6cbdf49fd92b95  0.0s
 => [internal] load metadata for docker.io/library/alpine:3.16                                        0.5s
 => [internal] load metadata for docker.io/library/golang:1.19-alpine3.16                             0.5s
 => [tools-builder 1/9] FROM docker.io/library/golang:1.19-alpine3.16@sha256:4b4f7127b01b372115ed905  0.0s
 => [internal] load build context                                                                     0.1s
 => => transferring context: 279.70kB                                                                 0.0s
 => [stage-2  1/28] FROM docker.io/library/alpine:3.16@sha256:b95359c2505145f16c6aa384f9cc74eeff78eb  0.0s
 => CACHED [tools-builder 2/9] RUN apk --no-cache add build-base linux-headers git bash ca-certifica  0.0s
 => CACHED [tools-builder 3/9] WORKDIR /app                                                           0.0s
 => CACHED [builder 4/8] ADD go.mod go.mod                                                            0.0s
 => CACHED [builder 5/8] ADD go.sum go.sum                                                            0.0s
 => CACHED [builder 6/8] RUN go mod download                                                          0.0s
 => [builder 7/8] ADD . .                                                                             0.6s
 => [builder 8/8] RUN --mount=type=cache,target=/root/.cache     --mount=type=cache,target=/tmp/go-  39.7s
 => CACHED [stage-2  2/28] RUN apk add --no-cache ca-certificates libstdc++ tzdata                    0.0s
 => CACHED [stage-2  3/28] RUN apk add --no-cache curl jq bind-tools                                  0.0s
 => CACHED [stage-2  4/28] RUN adduser -D -u 1000 -g 1000 erigon                                      0.0s
 => CACHED [stage-2  5/28] RUN mkdir -p ~/.local/share/erigon                                         0.0s
 => CACHED [tools-builder 4/9] ADD Makefile Makefile                                                  0.0s
 => CACHED [tools-builder 5/9] ADD tools.go tools.go                                                  0.0s
 => CACHED [tools-builder 6/9] ADD go.mod go.mod                                                      0.0s
 => CACHED [tools-builder 7/9] ADD go.sum go.sum                                                      0.0s
 => CACHED [tools-builder 8/9] RUN mkdir -p /app/build/bin                                            0.0s
 => CACHED [tools-builder 9/9] RUN make db-tools                                                      0.0s
 => CACHED [stage-2  6/28] COPY --from=tools-builder /app/build/bin/mdbx_chk /usr/local/bin/mdbx_chk  0.0s
 => CACHED [stage-2  7/28] COPY --from=tools-builder /app/build/bin/mdbx_copy /usr/local/bin/mdbx_co  0.0s
 => CACHED [stage-2  8/28] COPY --from=tools-builder /app/build/bin/mdbx_drop /usr/local/bin/mdbx_dr  0.0s
 => CACHED [stage-2  9/28] COPY --from=tools-builder /app/build/bin/mdbx_dump /usr/local/bin/mdbx_du  0.0s
 => CACHED [stage-2 10/28] COPY --from=tools-builder /app/build/bin/mdbx_load /usr/local/bin/mdbx_lo  0.0s
 => CACHED [stage-2 11/28] COPY --from=tools-builder /app/build/bin/mdbx_stat /usr/local/bin/mdbx_st  0.0s
 => [stage-2 12/28] COPY --from=builder /app/build/bin/devnet /usr/local/bin/devnet                   0.4s
 => [stage-2 13/28] COPY --from=builder /app/build/bin/downloader /usr/local/bin/downloader           0.5s
 => [stage-2 14/28] COPY --from=builder /app/build/bin/erigon /usr/local/bin/erigon                   0.5s
 => [stage-2 15/28] COPY --from=builder /app/build/bin/erigon-cl /usr/local/bin/erigon-cl             0.5s
 => [stage-2 16/28] COPY --from=builder /app/build/bin/evm /usr/local/bin/evm                         0.4s
 => [stage-2 17/28] COPY --from=builder /app/build/bin/hack /usr/local/bin/hack                       0.4s
 => [stage-2 18/28] COPY --from=builder /app/build/bin/integration /usr/local/bin/integration         0.4s
 => [stage-2 19/28] COPY --from=builder /app/build/bin/lightclient /usr/local/bin/lightclient         0.5s
 => [stage-2 20/28] COPY --from=builder /app/build/bin/observer /usr/local/bin/observer               0.4s
 => [stage-2 21/28] COPY --from=builder /app/build/bin/pics /usr/local/bin/pics                       0.4s
 => [stage-2 22/28] COPY --from=builder /app/build/bin/rpcdaemon /usr/local/bin/rpcdaemon             0.4s
 => [stage-2 23/28] COPY --from=builder /app/build/bin/rpctest /usr/local/bin/rpctest                 0.4s
 => [stage-2 24/28] COPY --from=builder /app/build/bin/sentinel /usr/local/bin/sentinel               0.3s
 => [stage-2 25/28] COPY --from=builder /app/build/bin/sentry /usr/local/bin/sentry                   0.4s
 => [stage-2 26/28] COPY --from=builder /app/build/bin/state /usr/local/bin/state                     0.5s
 => [stage-2 27/28] COPY --from=builder /app/build/bin/txpool /usr/local/bin/txpool                   0.5s
 => [stage-2 28/28] COPY --from=builder /app/build/bin/verkle /usr/local/bin/verkle                   0.5s
 => exporting to image                                                                                1.5s
 => => exporting layers                                                                               1.3s
 => => writing image sha256:7c577386242d539b77f45774ac2800dd449ffc9f187387a4a69ad0cd79fc9b04          0.0s
 => => naming to docker.io/library/erigon                                                             0.0s

Since the dependency and db-tools versions didn't change, all of those layers are cached and did not need to be rebuilt or redownloaded.

An additional advantage: build tools that can share cached layers (such as kaniko or GitLab Runner) can share the dependency layers automatically between runs, whether sequential or concurrent. Cache mounts, by contrast, are an extra piece that has to be configured, and they cannot be shared between concurrent builds.

docker push/pull speed

See this example of the image being pushed to a Docker registry:

CURRENT DOCKERFILE

The push refers to repository [cr.gfx.cafe/images/erigon/test]
51af77f8740b: Pushing  4.096kB
fb257f924975: Pushing [==================================================>]  11.78kB
9057ae9f6ad6: Pushing [>                                                  ]   17.8MB/962.8MB
0ffb38bafc9e: Pushing [=================================>                 ]  4.338MB/6.477MB
e5e13b0c77cb: Layer already exists

The existing image puts all the binaries in a single, very large layer. If the upload fails partway through, the entire upload is aborted and I must start again. The same applies to downloads.

NEW DOCKERFILE

The push refers to repository [cr.gfx.cafe/images/erigon/test]
ac47c1bb87c6: Pushing [===========================>                       ]  7.866MB/14.22MB
2ba8ef6b2d4f: Pushing [==========>                                        ]  9.177MB/45.08MB
49ab36df341c: Pushing [===========>                                       ]  11.31MB/48.23MB
71f41bc3c4d4: Pushing [===========>                                       ]  15.57MB/68.94MB
7f40d9db27c5: Pushing [=====>                                             ]  8.325MB/82.98MB
20866e83eb57: Waiting
f15875fce722: Waiting
eb299c01a4b0: Waiting
b5f45cfe93d4: Waiting
15054c0c5515: Waiting
1b757dfa7311: Waiting
8e1176a93523: Waiting
53cf053c5cd7: Waiting
a88382869dce: Waiting
5380564abef3: Waiting
8b49a1ab1232: Waiting
774dcc434c98: Waiting
1568598ebd63: Waiting
6d1ef72c9409: Preparing
1fccdb04baaa: Waiting
2a3531caafa0: Waiting
51d43a55eebb: Waiting
b94f90c4bd95: Waiting
f49e2054b147: Waiting
65324ece5c8a: Waiting
5d448d0b43e8: Waiting
38c55858fb7a: Waiting
e5e13b0c77cb: Waiting

Since the image is broken up into many smaller layers, the upload can happen in parallel, which is faster. We can also resume after a failed upload, since the layers that already finished do not need to be sent again.

FROM docker.io/library/golang:1.19-bullseye AS builder

RUN apt update
RUN apt install -y build-essential git bash ca-certificates libstdc++6

Collaborator:

Are you sure libstdc++6 is needed, and why 6?

elee1766 (Contributor Author), Dec 20, 2022:

I copied the deps from the Alpine Dockerfile. The closest match for libstdc++ in terms of compatibility is libstdc++6 (glibc6). I'm not 100% sure it is needed at runtime, but I assumed it was in the Alpine one for a reason.

The other option in Debian is libstdc++5 (glibc3.3).

make all


FROM docker.io/library/golang:1.19-alpine3.16 AS tools-builder

Collaborator:

Are you sure about using Alpine here?

elee1766 (Contributor Author):

I think it would be best for the tools to be built with the same libs as the image where the db is supposedly created.

I used the golang Alpine image so that I could easily run go mod vendor.

WORKDIR /app

ADD Makefile Makefile
ADD tools.go tools.go

Collaborator:

I think this line is useless, because tools.go is only used to ensure the binary deps are stored in go.mod.

elee1766 (Contributor Author), Dec 20, 2022:

Since I'm not copying the rest of the files, 'go mod vendor' will run 'go mod tidy' first, which causes those deps to be removed unless the tools.go file is there. It only worked for me when I copied the file.

Collaborator:

I don't see a reason to call "go mod tidy" in Docker.

elee1766 (Contributor Author):

'go mod vendor', which is used by the db-tools build process, calls 'tidy'. I'm not sure if there's a way to call vendor without tidy.

Collaborator:

Got it.
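
For reference, the tools-builder stage under discussion can be sketched like this. The steps are reconstructed from the build log in the PR description, so treat it as an approximation and not the merged file: the `apk` package list is taken from the untruncated builder line, and the final stage is trimmed to two of the mdbx binaries.

```dockerfile
FROM docker.io/library/golang:1.19-alpine3.16 AS tools-builder
RUN apk --no-cache add build-base linux-headers git bash ca-certificates libstdc++
WORKDIR /app

# Copy only what the db-tools target needs, so edits elsewhere in the
# repository do not invalidate this stage's cache.
ADD Makefile Makefile
# Per the thread above: without tools.go, the vendoring step in
# `make db-tools` dropped the tool dependencies for the PR author.
ADD tools.go tools.go
ADD go.mod go.mod
ADD go.sum go.sum

RUN mkdir -p /app/build/bin
RUN make db-tools

FROM docker.io/library/alpine:3.16
RUN apk add --no-cache ca-certificates libstdc++ tzdata
# db-tools binaries come from the rarely changing stage, so these layers
# stay identical across most releases.
COPY --from=tools-builder /app/build/bin/mdbx_chk /usr/local/bin/mdbx_chk
COPY --from=tools-builder /app/build/bin/mdbx_stat /usr/local/bin/mdbx_stat
```

Because this stage never sees the rest of the source tree, it is only rebuilt when the Makefile, tools.go, go.mod, or go.sum change.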

AskAlexSharov merged commit f2467a7 into erigontech:devel on Dec 18, 2022