Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix docker nightly build #2643

Open
wants to merge 15 commits into
base: gz-sim9
Choose a base branch
from
Open

Conversation

ntfshard
Copy link
Contributor

@ntfshard ntfshard commented Oct 9, 2024

🦟 Bug fix

Fixes #

Summary

Clean build by instruction in docker/README.md not working. Base image builds fine, but the next step is failing.

Ubuntu Focal (20.04) is too old to build gz-sim9 branch. It doesn't contain required GZ packages.

E: Package 'libgz-cmake4-dev' has no installation candidate
E: Unable to locate package libgz-common6-dev
E: Unable to locate package libgz-fuel-tools10-dev
E: Unable to locate package libgz-math8-eigen3-dev
E: Unable to locate package libgz-plugin3-dev
E: Unable to locate package libgz-physics8-dev
E: Unable to locate package libgz-rendering9-dev
E: Unable to locate package libgz-transport14-dev
E: Unable to locate package libgz-gui9-dev
E: Unable to locate package libgz-msgs11-dev
E: Unable to locate package libgz-sensors9-dev
E: Unable to locate package libsdformat15-dev

GCC8 is a very old compiler, I believe installation and replacing a default compiler was done for Ubuntu 18 (with default GCC7). Switched to GCC11 (default for Ubuntu 22). IMHO I'd prefer not to bind to compiler version in this case, but I made it as close as possible
Minor improvements in Readme and image
AFAIU docker/run.bash script not working well on modern systems, but it's out of scope of this question

Slightly out of scope, but couple of months ago tried same for a gz-sim8 branch and had problem with 2 missing gz libraries. If this patch will be accepted, I'll do similar changes on that branch too

Checklist

  • Signed all commits for DCO
  • Added tests
  • Updated documentation (as needed)
  • Updated migration guide (as needed)
  • Consider updating Python bindings (if the library has them)
  • codecheck passed (See contributing)
  • All tests passed (See test coverage)
  • While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining Signed-off-by messages.

Signed-off-by: Maksim Derbasov <[email protected]>
@ntfshard ntfshard requested a review from mjcarroll as a code owner October 9, 2024 15:23
@github-actions github-actions bot added the 🏛️ ionic Gazebo Ionic label Oct 9, 2024
Signed-off-by: Maksim Derbasov <[email protected]>
@mjcarroll
Copy link
Contributor

Would it additionally make sense to introduce a CI job to make sure that this docker file can build on, for instance, a nightly basis?

@ntfshard
Copy link
Contributor Author

Would it additionally make sense to introduce a CI job to make sure that this docker file can build on, for instance, a nightly basis?

Not a big master of GitHub actions, could try I suppose

@ntfshard
Copy link
Contributor Author

ntfshard commented Oct 10, 2024

Would it additionally make sense to introduce a CI job to make sure that this docker file can build on, for instance, a nightly basis?

Not a big master of GitHub actions, could try I suppose

Maybe in another PR? It not fits very well widely spread pipelines AFAIS. Actually I planned to make fix for annoying bug #2506 which affects us. (and we use older version in which I'd like to propagate fix too).
[In other words, propagate this changes on previous branches: gz-sim8, gz-sim7 and made fix for mentioned bug above on this branch and propagate it on a same branches too. It sounds reasonable?]

@mjcarroll
Copy link
Contributor

Maybe in another PR?

Absolutely, would just be a nice addition to keep these dockerfiles fresh and not regressing.

It sounds reasonable?

Good with me.

@ntfshard
Copy link
Contributor Author

Ok. The last mystery, what happened in Jenkins. Looks like something goes wrong with couple of python tests. I can't try restart build there, don't have a permissions.

@iche033
Copy link
Contributor

iche033 commented Oct 10, 2024

Ok. The last mystery, what happened in Jenkins. Looks like something goes wrong with couple of python tests. I can't try restart build there, don't have a permissions.

the failing python tests are likely due to osrf/homebrew-simulation#2834 which is currently affecting all gz packages. So not the fault of this PR

@ntfshard
Copy link
Contributor Author

Thank you for your explanation. What is the next step for this PR?
And what is the right way to propagate this changes to other branches? Cherry-pick(commit after merge) or just implement same changes due to it still will be a merge conflict.

@ntfshard ntfshard requested a review from mjcarroll October 15, 2024 07:49
@iche033
Copy link
Contributor

iche033 commented Oct 15, 2024

yeah, once this is merged, you can cherry pick the changes for backports or use mergify, e.g. add the comment @mergifyio backport gz-sim<N>

@ntfshard
Copy link
Contributor Author

Ok, will try) Thank you!
But right now while CI is red, it can't be merged.

docker/Dockerfile.base Outdated Show resolved Hide resolved
docker/Dockerfile.nightly Outdated Show resolved Hide resolved
docker/scripts/install_common_deps.sh Outdated Show resolved Hide resolved
docker/README.md Outdated Show resolved Hide resolved
Signed-off-by: Maksim Derbasov <[email protected]>
docker/run.bash Outdated
@@ -40,7 +31,6 @@ docker run -it \
-v "/etc/localtime:/etc/localtime:ro" \
-v "/dev/input:/dev/input" \
--rm \
--gpus all \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after this changes this script started working fine on a Ubuntu 22.04 host

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is needed to get GPU acceleration when using NVIDIA GPUs. Maybe leave it as is and add a statement below to remove it if not using a GPU?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll roll-back. Not sure how to dispatch automatically it for a GPU/noGPU cases. In my environment (VirtualBox) it leads to error:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Copy link
Contributor

@azeey azeey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes. I have a few more comments.

docker/run.bash Outdated
@@ -40,7 +31,6 @@ docker run -it \
-v "/etc/localtime:/etc/localtime:ro" \
-v "/dev/input:/dev/input" \
--rm \
--gpus all \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is needed to get GPU acceleration when using NVIDIA GPUs. Maybe leave it as is and add a statement below to remove it if not using a GPU?

@@ -18,17 +18,8 @@ ARGS=("$@")
# This is necessary so Gazebo can create a context for OpenGL rendering
# (even headless).
XAUTH=/tmp/.docker.xauth
if [ ! -f $XAUTH ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this change necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, previous code not working, here is example on a system with empty /tmp:

xauth:  /tmp/.docker.xauth not writable, changes will be ignored
xauth: (argv):1:  unable to read any entries from file "(stdin)"
chmod: changing permissions of '/tmp/.docker.xauth': Operation not permitted
Authorization required, but no authorization protocol specified

and previous code somehow creates /tmp/.docker.xauth as a directory

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@j-rivero any thoughts on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still quite simple to verify. But results can be false positive (it will be mentioned above error, but gui will work) if xhost +local was executed before (not sure, maybe it should be local:root, it's still part of ancient magic)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried this locally and it didn't work for me. Running

XAUTH=/tmp/.docker.xauth
xauth nlist $DISPLAY | sed -e 's/^..../ffff/' | xauth -f $XAUTH nmerge -

I get

xauth:  file /tmp/.docker.xauth does not exist

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double checked and both (old and new) implementations are working fine. AFAIR old version initially didn't work. Maybe xhost +x has some permanent effect(still applied after reboot)
Tried your experiment, just run a command outside of the script and got the same error message.

TBH, it's not my area of expertise, so I'll rollback it with minor change from :0 to $DISPLAY

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, it's not my area of expertise, so I'll rollback it with minor change from :0 to $DISPLAY

Me neither. I usually use distrobox or rocker which handle these types of things for me. The change to $DISPLAY looks good.

docker/README.md Outdated
@@ -75,7 +75,7 @@ distribution using debians.
image of Gazebo Garden:

```
./build.bash gz ./Dockerfile.gz
./build.bash gz-ionic ./Dockerfile.gz
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, but now we need to update the instruction right above to say "Gazebo Ionic"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll roll-back it here to garden and lines below too, due to build.bash not working properly with Ionic AFAIU. It executes Dockerfile.gz, which is based on nvidia/opengl:1.2-glvnd-devel-ubuntu20.04 image.

@ntfshard
Copy link
Contributor Author

Would it additionally make sense to introduce a CI job to make sure that this docker file can build on, for instance, a nightly basis?

Added, thanks to @skorobogatydmitry for help

@ntfshard ntfshard requested review from iche033 and azeey November 25, 2024 06:21
@ntfshard
Copy link
Contributor Author

Folks, can we wrap-up this PR? This PR fixes nightly docker build, which seems almost nobody using.

@iche033 @mjcarroll @azeey


sudo apt-get install --no-install-recommends -y \
libbenchmark-dev \
libbenchmark1
libbenchmark1.8.3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can get rid of libbenchmark1.8.3. It should be automatically installed with libbenchmark-dev

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this entire libbenchmark dependency is redundant, It's compiling fine even w/o this installation.
But as I said earlier, initial idea was to keep this PR as same as possible with previous implementation.

I'd like to remove it completely, if no objections

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libbenchmark is an optional dependency. There are tests that use it, but they're disabled if libbenchmark is not installed

if (GzBenchmark_FOUND)

So I think we should keep it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted, commend added

@@ -18,17 +18,8 @@ ARGS=("$@")
# This is necessary so Gazebo can create a context for OpenGL rendering
# (even headless).
XAUTH=/tmp/.docker.xauth
if [ ! -f $XAUTH ]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried this locally and it didn't work for me. Running

XAUTH=/tmp/.docker.xauth
xauth nlist $DISPLAY | sed -e 's/^..../ffff/' | xauth -f $XAUTH nmerge -

I get

xauth:  file /tmp/.docker.xauth does not exist

@ntfshard ntfshard requested a review from azeey December 13, 2024 17:30
Copy link

codecov bot commented Dec 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (gz-sim9@f0cde05). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             gz-sim9    #2643   +/-   ##
==========================================
  Coverage           ?   68.81%           
==========================================
  Files              ?      341           
  Lines              ?    33084           
  Branches           ?        0           
==========================================
  Hits               ?    22767           
  Misses             ?    10317           
  Partials           ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Maksim Derbasov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏛️ ionic Gazebo Ionic
Projects
Status: In review
Development

Successfully merging this pull request may close these issues.

4 participants