Properly handle in-flight deletes followed by adds in OrderedListState #28171

Merged: 1 commit, Sep 11, 2023

reuvenlax
Contributor

No description provided.

@github-actions
Contributor

Assigning reviewers. If you would like to opt out of this review, comment "assign to next reviewer":

R: @lostluck added as fallback since no labels match configuration

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

Iterables.filter(
    Iterables.transform(includingAdds, TimestampedValueWithId::getValue),
    tv -> !pendingDeletes.contains(tv.getTimestamp()));
Iterables.transform(includingAdds, TimestampedValueWithId::getValue);
Contributor

What if a delete was issued after an add? It feels like the timing of overlapping deletes and adds also needs to be considered.

Contributor Author

This case was already correctly handled. When clearRange is called, any pending adds in that range are removed from pendingAdds:

https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateInternals.java#L1001
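
To make the interaction concrete, here is a minimal, hypothetical sketch of the buffering logic in plain Java. It is not the WindmillStateInternals code: the class name, the `persist` helper, and the use of `TreeMap` and `long[]` ranges are assumptions made purely for illustration. It shows `clearRange` dropping pending adds that fall inside the range, and a read applying pending deletes only to already-persisted values, so an add issued after a delete survives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical, simplified model of an ordered list with buffered adds and deletes. */
final class PendingOpsSketch {
  // Values already committed to the backend, keyed by timestamp in millis.
  private final TreeMap<Long, List<String>> persisted = new TreeMap<>();
  // Adds buffered since the last commit.
  private final TreeMap<Long, List<String>> pendingAdds = new TreeMap<>();
  // Buffered delete ranges, each stored as {minInclusive, limitExclusive}.
  private final List<long[]> pendingDeletes = new ArrayList<>();

  /** Test helper (assumption): simulates a value that was committed earlier. */
  void persist(long ts, String value) {
    persisted.computeIfAbsent(ts, k -> new ArrayList<>()).add(value);
  }

  void add(long ts, String value) {
    pendingAdds.computeIfAbsent(ts, k -> new ArrayList<>()).add(value);
  }

  void clearRange(long min, long limit) {
    // An add followed by a delete: the buffered add is dropped right here.
    pendingAdds.subMap(min, true, limit, false).clear();
    pendingDeletes.add(new long[] {min, limit});
  }

  private boolean isDeleted(long ts) {
    for (long[] range : pendingDeletes) {
      if (ts >= range[0] && ts < range[1]) {
        return true;
      }
    }
    return false;
  }

  List<String> read() {
    TreeMap<Long, List<String>> merged = new TreeMap<>();
    // Pending deletes filter only the persisted values.
    persisted.forEach((ts, values) -> {
      if (!isDeleted(ts)) {
        merged.computeIfAbsent(ts, k -> new ArrayList<>()).addAll(values);
      }
    });
    // Pending adds are merged unfiltered: any that remain were added after the delete.
    pendingAdds.forEach(
        (ts, values) -> merged.computeIfAbsent(ts, k -> new ArrayList<>()).addAll(values));
    List<String> out = new ArrayList<>();
    merged.values().forEach(out::addAll);
    return out;
  }

  public static void main(String[] args) {
    PendingOpsSketch state = new PendingOpsSketch();
    state.persist(0, "first");
    state.persist(2, "third");
    state.add(2, "fourth");   // add, then delete: removed by clearRange
    state.clearRange(2, 5);
    state.add(4, "fourth");   // delete, then add: must survive the read
    System.out.println(state.read());
  }
}
```

In this sketch, filtering the pending adds by the pending delete ranges at read time would wrongly hide the re-added value at timestamp 4, which is the delete-then-add case this PR addresses.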

Contributor

So the last delete wins because of the filter clause at line 950, and the last insert wins because of the transform at line 975 that merges in the add?

Contributor

@slilichenko slilichenko left a comment

LGTM


orderedListState.add(TimestampedValue.of("second", Instant.ofEpochMilli(1)));
orderedListState.add(TimestampedValue.of("third", Instant.ofEpochMilli(2)));
orderedListState.add(TimestampedValue.of("fourth", Instant.ofEpochMilli(2)));
Contributor

Is the "fourth" element a typo? It has the same ts as "third" and gets added again later with a different ts.

Contributor Author

No, this is intentional: I delete it and then add it back with a different ts (though one still within the deletion range).

TimestampedValue.of("fourth", Instant.ofEpochMilli(4)),
TimestampedValue.of("fifth", Instant.ofEpochMilli(5)),
TimestampedValue.of("sixth", Instant.ofEpochMilli(5)),
TimestampedValue.of("seventh", Instant.ofEpochMilli(5)),
Contributor

Side note: it would be good to document how the list behaves when multiple elements are added with the same timestamp (that it behaves as a multimap, and what ordering guarantees, if any, apply to entries under the same key).
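
As a rough illustration of the multimap-like behavior mentioned here, the following sketch keeps several values under one timestamp using only the JDK. The class and method names are hypothetical, and the insertion-order tie-breaking within a timestamp is a property of this sketch only; it is not a statement about what OrderedListState guarantees.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical illustration of multimap-like behavior; not Beam's implementation. */
final class SameTimestampSketch {
  // Timestamp in millis -> all values added at that timestamp.
  private final TreeMap<Long, List<String>> entries = new TreeMap<>();

  void add(long timestampMillis, String value) {
    // A second add at the same timestamp appends instead of overwriting.
    entries.computeIfAbsent(timestampMillis, k -> new ArrayList<>()).add(value);
  }

  List<String> read() {
    // Timestamps come back sorted; ties keep insertion order in this sketch only.
    List<String> out = new ArrayList<>();
    entries.values().forEach(out::addAll);
    return out;
  }

  public static void main(String[] args) {
    SameTimestampSketch list = new SameTimestampSketch();
    list.add(5, "fifth");
    list.add(4, "fourth");
    list.add(5, "sixth");
    list.add(5, "seventh");
    System.out.println(list.read());
  }
}
```

All three values added at timestamp 5 are retained, which is the behavior worth spelling out in the documentation along with whatever ordering among them is actually guaranteed.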

@reuvenlax reuvenlax force-pushed the ordered_list_state_consistency branch from 57a8ccf to f2f5bf6 on September 1, 2023 23:06
@reuvenlax
Contributor Author

Run Java PreCommit

@github-actions
Contributor

Reminder, please take a look at this pr: @lostluck

@reuvenlax reuvenlax merged commit 5e4b9bb into apache:master Sep 11, 2023

2 participants