Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix dockerfile for druid image #15264

Merged
merged 3 commits into from
Nov 1, 2023

Conversation

georgew5656
Copy link
Contributor

@georgew5656 georgew5656 commented Oct 26, 2023

Fixes docker image build issues with apache/druid.

Description

Currently, trying to build the docker image fails with the following error when trying to grab the static bash utils.

#16 0.263 wget: note: TLS certificate validation not implemented
#16 0.297 wget: TLS error from peer (alert code 80): 80
#16 0.297 wget: error getting response: Connection reset by peer

This seems to be caused by a SSL validation issue in busybox (it seems to happen for github.com but not google.com). I thought this might be fixed by moving to busybox 1.35.0, but I got the same error.

To get this working, I moved the wget commands to another setup layer and copied the file into the distroless build.

It seems like it would be better to switch to busybox 1.35.0 anyways (there is a critical CVE: https://nvd.nist.gov/vuln/detail/CVE-2022-48174), although we technically just use the image for the setup tools, not sure if the CVE actually affects us), but the issue with switching to busybox 1.35.0 is that we get the build errors that were described in this pr: #14518

This can be remediated by also upgrading the distroless image to debian12, but there is no java11-debian12 distroless image, only a java17-debian12 image.

As far as I know druid supports java17 as a target now so maybe this would be okay? I can make that change if people think it makes sense.

I tested the new image and it ran fine.

Release note

Fix druid docker image

Key changed/added classes in this PR
  • distribution/docker/Dockerfile

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@abhishekagarwal87
Copy link
Contributor

@hardikbajaj - FYI

@abhishekagarwal87
Copy link
Contributor

I am ok moving to Java 17 based image.

Copy link
Contributor

@petermarshallio petermarshallio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ on the need for --platform

@@ -17,7 +17,7 @@
# under the License.
#

ARG JDK_VERSION=11
ARG JDK_VERSION=17

# The platform is explicitly specified as x64 to build the Druid distribution.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @georgew5656! I was having real issues getting this to work on my M2 ;)

Is --platform still required here?

Copy link
Contributor Author

@georgew5656 georgew5656 Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think this druid image actually supports multiarch images right now, when I go to the druid dockerhub i only see a amd64 image which would explain why it doesn't work on a arm based mac

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abhishekagarwal87 do you know how this image gets built? is it a pipeline somewhere?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a bit of black box for me as well. Apache infra jobs builds and pushes the image as far as I know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I shoulda said (!) was I managed to build an arm Docker image using your fix - but yeah the official image would be nicer :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the dockerhub and the official image is being built right now from a dockerhub automated build on the tags in github:

Screen Shot 2023-10-31 at 12 55 15 PM

Copy link
Contributor

@jon-wei jon-wei Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a way to pass in a target platform for the automated build, but it seems like it'd be possible to have another Dockerfile configured for arm64, and then we could add a new automated build rule that reads from that instead

Screen Shot 2023-10-31 at 1 07 27 PM

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option might be to move the docker image building to a github action on tags only, have some action that calls the appropriate buildx command, and pushes a multi-arch image to dockerhub

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The automated dockerhub builds seem nice because then we don't have to worry about dockerhub credentials in our GHA stuff.

RUN ["/busybox/busybox", "--install", "/bin"]

# Predefined builtin arg, see: https://docs.docker.com/engine/reference/builder/#automatic-platform-args-in-the-global-scope
FROM alpine as bash-static
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the new usage of alpine for? Would it make sense to pin this to a specific version so the build is more reproducible?

Copy link
Contributor Author

@georgew5656 georgew5656 Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i used alpine as the layer to copy the files instead of busybox because wget is broken in busybox and alpine is a simple image to run a wget command, i can pin the image tag to something

since busybox is based on alpine i figured this was the best image to get this working without introducing a bunch of other vuln scan stuff

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, TY for the explanation.

@@ -17,7 +17,7 @@
# under the License.
#

ARG JDK_VERSION=11
ARG JDK_VERSION=17

# The platform is explicitly specified as x64 to build the Druid distribution.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can raise an Infra ticket, or ask in ASF Slack, to adjust how the images get built. (Or at least ask how they get built.) If it's possible to publish ARM images too that would be nice.

@@ -17,7 +17,7 @@
# under the License.
#

ARG JDK_VERSION=11
ARG JDK_VERSION=17
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am good with this, and in fact I prefer it 🙂

@abhishekagarwal87 abhishekagarwal87 merged commit 49e0cba into apache:master Nov 1, 2023
82 checks passed
@abhishekagarwal87
Copy link
Contributor

@LakshSingla - We may need this for 28 to build docker images.

LakshSingla pushed a commit to LakshSingla/druid that referenced this pull request Nov 6, 2023
Fixes docker image build issues with apache/druid.
LakshSingla pushed a commit that referenced this pull request Nov 6, 2023
Fixes docker image build issues with apache/druid.
CaseyPan pushed a commit to CaseyPan/druid that referenced this pull request Nov 17, 2023
Fixes docker image build issues with apache/druid.
@LakshSingla LakshSingla added this to the 29.0.0 milestone Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants