-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure that the ros2 daemon is stopped before running system tests #460
Conversation
Signed-off-by: Chris Lalancette <[email protected]>
Signed-off-by: Chris Lalancette <[email protected]>
PR job is failing because we need a new release of rclcpp. I'll do that presently, but that shouldn't block this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though this looks like an ok workaround, I wonder why there's a ros 2 daemon running in the background.
cmd=['ros2', 'daemon', 'stop'], | ||
name='daemon-stop', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why we have a daemon running in the background in the first place.
This looks fine as a workaround, but all previous process that created a daemon should not be "leaking it".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can have something running in CI that detects when a test "leaks" a ros 2 daemon, but it doesn't sound like something trivial to implement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why we have a daemon running in the background in the first place.
This looks fine as a workaround, but all previous process that created a daemon should not be "leaking it".
In other places in the codebase, we are careful to kill the daemon before running the tests. For instance: https://github.com/ros2/ros2cli/blob/f6c4a5b05d172f8156645b8fcf8bd2dfa23e7bb2/ros2node/test/test_cli_duplicate_node_names.py#L55 . From that perspective, it is good for each test to be "defensive", and make sure the environment is the way it should be before starting.
But we are not similarly careful about tests cleaning up after themselves. In that same example in ros2node, we kill the daemon on startup, but then we do not kill it at the end. Maybe we should enhance all of the tests do do that, but it is honestly not clear to me. Partly the problem is that starting up the daemon can be a side-effect of doing other things (like running a ros2
cli command), so it is not always clear which tests need to kill it at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but then we do not kill it at the end. Maybe we should enhance all of the tests do do that,
👍 I think that killing the daemon at the end of those tests would be a good idea.
Partly the problem is that starting up the daemon can be a side-effect of doing other things (like running a ros2 cli command), so it is not always clear which tests need to kill it at the end.
Yes, having a defensive ros2 daemon stop
before testing doesn't look bad, but it's also not clear which tests can be affected by the presence of a "leaked" daemon and which ones not.
So it sounds like we would actually like to run "ros2 daemon stop" before running each test.
@ros-pull-request-builder retest this please |
All right, green CI on Linux, Linux aarch64, and macOS. The yellow Windows failures are already known, so I'm going to go ahead and merge this one. |
While debugging some failures over in ros2/rmw#293, we came across a situation that seemed to make no sense. The change from Fast-RTPS -> CycloneDDS as the default was causing Connext builds to fail. After poking at it for a while, we think that an earlier running instance of the ros2 daemon is causing some incompatibilities for tests later on.
This PR changes it so that the test_communication tests always kill off the ros2 daemon before running tests, similar to how the ros2cli tests work.
Before:
After:
Full CI with this change in place: