Enhancing log playback performance #351

adlarkin · 2020-09-15T16:51:12Z

I've changed how log playback works in order to increase RTF and remove jumpy playback.

Previously, when setting updated entity poses in the log file as a PeriodicChange, log playback was jumpy. This was because the SceneBroadcaster updates at a different rate than when the log file is parsed (60 Hz), so, if log parsing didn't occur during the same cycle as a publication from the SceneBroadcaster, changed poses from the latest log parse call were not captured in the rendering update. In order to get around this, the updated poses were saved as a OneTimeChange.

The issue with using OneTimeChange is that playback performance is noticeably slower for worlds with a lot of entities. The reason for this is because whenever there is a OneTimeChange, the SceneBroadcaster will publish the full state, even if only one pose was changed.

In order to maintain the performance that comes with PeriodicChange while removing the jumpy playback that arises from different parsing and scene update rates, I cached the changed poses from the latest log parse call as a PeriodicChange. Then, whenever it is time for a scene update, only the cached pose updates are rendered instead of the full scene.

Below shows the RTF difference when playing back a log at regular speed for a SubT world, which has a lot of entities. The first video is before caching (OneTimeChange), and the second video uses caching (PeriodicChange). A few things to note:

The RTF varies significantly using OneTimeChange, when it should stay close to 100% since we omit physics during log playback.
The RTF stays close to 100% when using caching and PeriodicChange, which is expected for log playback at regular speed since we have the pose updates for every step without having to run any physics.

Here is an example of log playback at double speed with the simple double pendulum world. We still achieve smooth playback and keep an RTF close to the theoretical expectation of 200% for 2x playback speed (on my machine, at least).

Signed-off-by: Ashton Larkin [email protected]

adlarkin · 2020-09-16T21:33:45Z

After taking a look at the failed tests in Jenkins, I believe that the failed tests can be ignored for the following reasons:

I copied the Docker setup from Jenkins over to my machine, and I could not reproduce the failed test from LogSystemTest.RecordAndPlayback. Perhaps the failure of this test in Jenkins had something to do with the CI machine?
There was a segmentation fault for INTEGRATION_triggered_publisher.test_ran, which also appears to be problematic in previous builds (see build 4138).
There was a test failure for DiffDriveTest.SkidPublishCmd, which also occurred in the previous build.

Considering that the DiffDrive test failure is an issue in other builds, I'm assuming that we can ignore the failure of this test in the GitHub action Ubuntu CI as well.

@chapulina @iche033 what do you guys think?

src/systems/log/LogPlayback.cc

Signed-off-by: Ashton Larkin <[email protected]>

Signed-off-by: Nate Koenig <[email protected]>

nkoenig · 2020-09-17T13:24:39Z

I'm noticing some jerky playback. I tested with a log file that has five robot models. Can you confirm?

nkoenig · 2020-09-17T13:27:22Z

The problem appears to be on the GUI side. The GUI process is running at over 200% CPU, and the server is running at 48% while maintaining a constant RTF.

adlarkin · 2020-09-17T15:33:05Z

I'm noticing some jerky playback. I tested with a log file that has five robot models. Can you confirm?

@nkoenig @iche033 I believe the jerky playback has been re-introduced by commit b07ebdd.

We cannot call clear on every Update, because if we clear cached poses but never parse the log file on a given Update cycle (which can happen depending on which timestep we are on - sometimes, the batch received from the log file is empty), then we have no updated poses to share with the SceneBroadcaster, which means that no rendering update will occur until a future log file Parse call aligns with the next update from the SceneBroadcaster (recall that the SceneBroadcaster publishes changes at 60Hz, NOT at every PreUpdate -> Update -> PostUpdate step - so, log file parsing and rendering updates don't always line up).

So, we must keep the poses that are currently cached until we know that we have new poses to overwrite the cache with, or else we run the risk of losing track of recent changes since Parse is not called every Update cycle (we know that we have new poses to overwrite the cache with whenever Parse is called).

Does that make sense? If not, I can try to explain things another way, but essentially, we have to remember that the EntityComponentManager keeps track of changed entities for a single PreUpdate -> Update -> PostUpdate step only - so, in order to make sure that pose updates from a step at time t are captured in a rendering update that occurs at time t + dt, we must keep these updates cached and only clear them if we have something to replace them with since they are a PeriodicChange, not a OneTimeChange. By calling clear() on the cached poses each step, we are assuming that we always have new pose updates to replace the cache with, which isn't necessarily true.

adlarkin · 2020-09-17T15:50:23Z

@nkoenig @iche033 as a follow up, I tested the code before and after commit b07ebdd, and noticed that jumpy playback occurs when calling clear every Update instead of in the first Parse call - so I think it's safe to say that jumpy playback is occurring because Parse isn't necessarily called every Update, as I described in my previous comment.

In order to achieve smooth playback, should we revert b07ebdd and go back to 06ba47b, or do either of you have any other preferences?

nkoenig · 2020-09-17T16:30:07Z

@nkoenig @iche033 as a follow up, I tested the code before and after commit b07ebdd, and noticed that jumpy playback occurs when calling clear every Update instead of in the first Parse call - so I think it's safe to say that jumpy playback is occurring because Parse isn't necessarily called every Update, as I described in my previous comment.

In order to achieve smooth playback, should we revert b07ebdd and go back to 06ba47b, or do either of you have any other preferences?

Thanks for the explanation. That makes sense. I'll revert my changes.

Signed-off-by: Nate Koenig <[email protected]>

nkoenig · 2020-09-17T16:39:09Z

Reverted.

Signed-off-by: Ashton Larkin <[email protected]>

iche033 · 2020-09-17T18:18:46Z

changes look good. Waiting for one last round of CI builds

nkoenig · 2020-09-17T22:51:28Z

Can I get one more review of this?

chapulina

Great work, @adlarkin ! Solves 2 problems at once 👍

codecov · 2020-09-17T23:39:39Z

Codecov Report

Merging #351 into ign-gazebo2 will increase coverage by 24.20%.
The diff coverage is 76.01%.

@@               Coverage Diff                @@
##           ign-gazebo2     #351       +/-   ##
================================================
+ Coverage        53.78%   77.99%   +24.20%     
================================================
  Files              121      186       +65     
  Lines             5838    10184     +4346     
================================================
+ Hits              3140     7943     +4803     
+ Misses            2698     2241      -457

Impacted Files	Coverage Δ
include/ignition/gazebo/EntityComponentManager.hh	`100.00% <ø> (ø)`
include/ignition/gazebo/Server.hh	`100.00% <ø> (ø)`
include/ignition/gazebo/ServerConfig.hh	`100.00% <ø> (ø)`
...nclude/ignition/gazebo/components/Serialization.hh	`61.29% <0.00%> (+61.29%)`	⬆️
.../plugins/component_inspector/ComponentInspector.cc	`9.41% <0.00%> (ø)`
src/systems/breadcrumbs/Breadcrumbs.hh	`100.00% <ø> (ø)`
src/systems/diff_drive/DiffDrive.hh	`100.00% <ø> (ø)`
src/systems/joint_controller/JointController.hh	`100.00% <ø> (ø)`
...stems/joint_state_publisher/JointStatePublisher.hh	`100.00% <ø> (ø)`
src/systems/user_commands/UserCommands.hh	`100.00% <ø> (ø)`
... and 136 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe7b726...7663a6d. Read the comment docs.

* enhanced log playback performance Signed-off-by: Ashton Larkin <[email protected]> * Handling multiple Parse(...) calls in a single Update Signed-off-by: Ashton Larkin <[email protected]> * Change clear behavior Signed-off-by: Nate Koenig <[email protected]> * Revert changes Signed-off-by: Nate Koenig <[email protected]> * explain cached pose clearing behavior Signed-off-by: Ashton Larkin <[email protected]> * fixed whitespacing for codecheck Signed-off-by: Ashton Larkin <[email protected]> Co-authored-by: Nate Koenig <[email protected]> Co-authored-by: Ian Chen <[email protected]> Co-authored-by: Louise Poubel <[email protected]> Signed-off-by: Guillaume Doisy <[email protected]>

github-actions bot added the 📜 blueprint Ignition Blueprint label Sep 15, 2020

adlarkin marked this pull request as ready for review September 16, 2020 17:03

adlarkin requested a review from chapulina as a code owner September 16, 2020 17:03

adlarkin requested review from nkoenig and iche033 September 16, 2020 17:03

nkoenig reviewed Sep 16, 2020

View reviewed changes

src/systems/log/LogPlayback.cc Outdated Show resolved Hide resolved

adlarkin added 2 commits September 16, 2020 23:03

enhanced log playback performance

a1e48dd

Signed-off-by: Ashton Larkin <[email protected]>

Handling multiple Parse(...) calls in a single Update

06ba47b

Signed-off-by: Ashton Larkin <[email protected]>

adlarkin force-pushed the adlarkin/log_playback_improvements branch from 852fc9e to 06ba47b Compare September 17, 2020 03:03

nkoenig approved these changes Sep 17, 2020

View reviewed changes

Change clear behavior

b07ebdd

Signed-off-by: Nate Koenig <[email protected]>

adlarkin mentioned this pull request Sep 17, 2020

Playback Scrubber #299

Merged

Revert changes

c7c0118

Signed-off-by: Nate Koenig <[email protected]>

adlarkin and others added 3 commits September 17, 2020 14:07

explain cached pose clearing behavior

dee2afe

Signed-off-by: Ashton Larkin <[email protected]>

fixed whitespacing for codecheck

6155f34

Signed-off-by: Ashton Larkin <[email protected]>

Merge branch 'ign-gazebo2' into adlarkin/log_playback_improvements

6d7dd52

Merge branch 'ign-gazebo2' into adlarkin/log_playback_improvements

7663a6d

chapulina approved these changes Sep 17, 2020

View reviewed changes

iche033 approved these changes Sep 17, 2020

View reviewed changes

nkoenig merged commit b3a4719 into ign-gazebo2 Sep 17, 2020

nkoenig deleted the adlarkin/log_playback_improvements branch September 17, 2020 23:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhancing log playback performance #351

Enhancing log playback performance #351

adlarkin commented Sep 15, 2020 •

edited

Loading

adlarkin commented Sep 16, 2020

nkoenig commented Sep 17, 2020

nkoenig commented Sep 17, 2020

adlarkin commented Sep 17, 2020 •

edited

Loading

adlarkin commented Sep 17, 2020

nkoenig commented Sep 17, 2020

nkoenig commented Sep 17, 2020

iche033 commented Sep 17, 2020

nkoenig commented Sep 17, 2020

chapulina left a comment

codecov bot commented Sep 17, 2020

Enhancing log playback performance #351

Enhancing log playback performance #351

Conversation

adlarkin commented Sep 15, 2020 • edited Loading

adlarkin commented Sep 16, 2020

nkoenig commented Sep 17, 2020

nkoenig commented Sep 17, 2020

adlarkin commented Sep 17, 2020 • edited Loading

adlarkin commented Sep 17, 2020

nkoenig commented Sep 17, 2020

nkoenig commented Sep 17, 2020

iche033 commented Sep 17, 2020

nkoenig commented Sep 17, 2020

chapulina left a comment

Choose a reason for hiding this comment

codecov bot commented Sep 17, 2020

Codecov Report

adlarkin commented Sep 15, 2020 •

edited

Loading

adlarkin commented Sep 17, 2020 •

edited

Loading