[Bug][Go SDK]: Checkpointing should "split" restrictions to claimed positions. #24966

Closed
1 of 15 tasks
lostluck opened this issue Jan 10, 2023 · 1 comment
Assignees: lostluck
Labels: bug, done & done, go, P1

Comments

@lostluck
Contributor

What happened?

The Go SDK doesn't currently handle process continuations correctly when partial work has been done.

The correct behavior is that any claimed positions are assumed to have been processed by the primary, leaving the rest to the residual. That is, it should be possible for a DoFn to eventually process an entire restriction one position at a time by always returning a process continuation delay after each claimed position.

At present, the original restriction is used instead of being split at the claimed position boundary, leading to infinite processing.
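
To make the expected behavior concrete, here is a minimal sketch of checkpointing at the claimed position boundary. It does not use the Beam Go SDK: `Restriction`, `Tracker`, `Checkpoint`, and `ProcessOneAtATime` are hypothetical, simplified stand-ins written only for illustration, assuming a half-open offset range.

```go
package main

import "fmt"

// Restriction is a hypothetical half-open range [Start, End) of positions.
// It mirrors the idea of an offset range restriction but is not the Beam type.
type Restriction struct {
	Start, End int64
}

// Tracker is a hypothetical, simplified restriction tracker.
type Tracker struct {
	rest    Restriction
	claimed int64 // last claimed position; Start-1 means nothing is claimed yet
}

// NewTracker returns a tracker with no positions claimed.
func NewTracker(r Restriction) *Tracker {
	return &Tracker{rest: r, claimed: r.Start - 1}
}

// TryClaim claims pos if it lies inside the restriction and past the last claim.
func (t *Tracker) TryClaim(pos int64) bool {
	if pos <= t.claimed || pos >= t.rest.End {
		return false
	}
	t.claimed = pos
	return true
}

// Checkpoint splits at the claimed position boundary: the primary keeps
// everything up to and including the last claim, the residual gets the rest.
func (t *Tracker) Checkpoint() (primary, residual Restriction) {
	split := t.claimed + 1
	primary = Restriction{Start: t.rest.Start, End: split}
	residual = Restriction{Start: split, End: t.rest.End}
	t.rest = primary
	return primary, residual
}

// ProcessOneAtATime claims a single position per invocation, checkpoints, and
// resumes on the residual, as a DoFn returning a process continuation would.
// It returns the number of invocations needed to drain the restriction.
func ProcessOneAtATime(r Restriction) int {
	invocations := 0
	for r.Start < r.End {
		t := NewTracker(r)
		if !t.TryClaim(r.Start) {
			break
		}
		_, r = t.Checkpoint() // resume from the residual, not the original
		invocations++
	}
	return invocations
}

func main() {
	fmt.Println(ProcessOneAtATime(Restriction{Start: 0, End: 5})) // prints 5
}
```

Because each resumption starts from the residual, the loop terminates after one invocation per position. If `ProcessOneAtATime` resumed from the original restriction instead, it would reclaim the same position forever, which is the infinite processing described above.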

Related: #24931, which rendered these independent by element.

Issue Priority

Priority: 1 (data loss / total loss of function)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@lostluck lostluck added this to the 2.45.0 Release milestone Jan 10, 2023
@lostluck lostluck self-assigned this Jan 10, 2023
@lostluck
Contributor Author

Further validation showed that this is already how the SDK works. I must have remembered some older bad output from the replacement runner for #24789.

I may circle back to this with some additional "split on 0" and "split on 1" unit tests for the offset range restriction, but it's currently a non-issue.

@lostluck lostluck added the done & done Issue has been reviewed after it was closed for verification, followups, etc. label Jan 10, 2023