Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update releases 2.33.0 and 2.34.0 to include broken BQ copy jobs in batch as a known issue #27563

Merged
merged 5 commits into from
Jul 20, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions website/www/site/content/en/blog/beam-2.33.0.md
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,7 @@ notes](https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527
* Spark 2.x users will need to update Spark's Jackson runtime dependencies (`spark.jackson.version`) to at least version 2.9.2, due to Beam updating its dependencies.
* See a full list of open [issues that affect](https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20affectedVersion%20%3D%202.33.0%20ORDER%20BY%20priority%20DESC%2C%20updated%20DESC) this version.
* Go SDK jobs may produce "Failed to deduce Step from MonitoringInfo" messages following successful job execution. The messages are benign and don't indicate job failure. These are due to not yet handling PCollection metrics.
* Large BigQueryIO writes that use file loads method will fail in batch mode. Specifically, writes that are large enough to use copy jobs.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should mention what kind of errors users will see and what the action users need to do. If you have a github issue, you could add more details there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have an issue for this, but I'll add an error description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this?

Large BigQueryIO writes with the FILE_LOADS method might fail in batch mode when large writes use copy jobs. The resulted in error message is IllegalArgumentException: Attempting to access unknown side input. Please upgrade to a newer version (> 2.34.0) or use another write method (STORAGE_WRITE_API).

And does this only affect Java SDK?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed it, PTAL


## List of Contributors

Expand Down
4 changes: 4 additions & 0 deletions website/www/site/content/en/blog/beam-2.34.0.md
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,10 @@ notes](https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527
* Fixed error when importing the DataFrame API with pandas 1.0.x installed ([BEAM-12945](https://issues.apache.org/jira/browse/BEAM-12945)).
* Fixed top.SmallestPerKey implementation in the Go SDK ([BEAM-12946](https://issues.apache.org/jira/browse/BEAM-12946)).

### Known Issues

* Large BigQueryIO writes that use file loads method will fail in batch mode. Specifically, writes that are large enough to use copy jobs.

## List of Contributors

According to git shortlog, the following people contributed to the 2.34.0 release. Thank you to all contributors!
Expand Down