Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(11397): surface proper errors in ParquetSink #11399

Merged
merged 2 commits into from
Jul 12, 2024

Conversation

wiedld
Copy link
Contributor

@wiedld wiedld commented Jul 10, 2024

Which issue does this PR close?

Closes #11397

Rationale for this change

During the parallel writes in ParquetSink, we spawn a series of parallel tasks and then message pass the outcome from one task to the next. In abstraction:
read_batches => channel => Vec<col_write_tasks> => channel => Vec<serialize_rowgroup_tasks>

When we encounter an error in one of the Vec<x_tasks> we are first surfacing an error on the channel.send() rather than on the task join.

What changes are included in this PR?

Don't surface the errors on the channel send.
This results in the proper error returned, as can be seen on the updated test.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the core Core DataFusion crate label Jul 10, 2024
@wiedld wiedld marked this pull request as ready for review July 10, 2024 19:23
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I think we need to return early here rather than ignoring the error

BTW I tried returning early locally and the test still passes

datafusion/core/src/datasource/file_format/parquet.rs Outdated Show resolved Hide resolved
@comphead
Copy link
Contributor

I agree with @alamb early return is way better than ignoring error.
Ignoring error leads to unpredictable behavior

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thanks @wiedld

@alamb alamb merged commit 1dfac86 into apache:main Jul 12, 2024
23 checks passed
Lordworms pushed a commit to Lordworms/arrow-datafusion that referenced this pull request Jul 12, 2024
* fix(11397): do not surface errors for closed channels, and instead let the task join errors be surfaced

* fix(11397): terminate early on channel send failure
@alamb alamb deleted the 11397/surface-proper-errors branch July 12, 2024 18:25
wiedld added a commit to influxdata/arrow-datafusion that referenced this pull request Jul 12, 2024
* fix(11397): do not surface errors for closed channels, and instead let the task join errors be surfaced

* fix(11397): terminate early on channel send failure
appletreeisyellow pushed a commit to influxdata/arrow-datafusion that referenced this pull request Jul 12, 2024
* fix(11397): do not surface errors for closed channels, and instead let the task join errors be surfaced

* fix(11397): terminate early on channel send failure
findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024
* fix(11397): do not surface errors for closed channels, and instead let the task join errors be surfaced

* fix(11397): terminate early on channel send failure
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 17, 2024
* fix(11397): do not surface errors for closed channels, and instead let the task join errors be surfaced

* fix(11397): terminate early on channel send failure
appletreeisyellow pushed a commit to influxdata/arrow-datafusion that referenced this pull request Jul 17, 2024
* fix(11397): do not surface errors for closed channels, and instead let the task join errors be surfaced

* fix(11397): terminate early on channel send failure

Add Optimizer Sanity Checker, improve sortedness equivalence properties (apache#11196)

* Initial optimizer sanity checker.

Only includes sort reqs, docs will be added.

* Add distro and pipeline friendly checks

* Also check the plans we create are correct.

* Add distribution test cases using global limit exec.

* Add test for multiple children using SortMergeJoinExec.

* Move PipelineChecker to SanityCheckPlan

* Fix some tests and add docs

* Add some test docs and fix clippy diagnostics.

* Fix some failing tests

* Replace PipelineChecker with SanityChecker in .slt files.

* Initial commit

* Slt tests pass

* Resolve linter errors

* Minor changes

* Minor changes

* Minor changes

* Minor changes

* Sort PreservingMerge clear per partition

* Minor changes

* Update output_requirements.rs

* Address reviews

* Update datafusion/core/src/physical_optimizer/optimizer.rs

Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Update datafusion/core/src/physical_optimizer/sanity_checker.rs

Co-authored-by: Mehmet Ozan Kabak <[email protected]>

* Address reviews

* Minor changes

* Apply suggestions from code review

Co-authored-by: Andrew Lamb <[email protected]>

* Update comment

* Add map implementation

---------

Co-authored-by: Erman Yafay <[email protected]>
Co-authored-by: berkaysynnada <[email protected]>
Co-authored-by: Mehmet Ozan Kabak <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jul 18, 2024
* fix(11397): do not surface errors for closed channels, and instead let the task join errors be surfaced

* fix(11397): terminate early on channel send failure
appletreeisyellow pushed a commit to influxdata/arrow-datafusion that referenced this pull request Jul 22, 2024
* fix(11397): do not surface errors for closed channels, and instead let the task join errors be surfaced

* fix(11397): terminate early on channel send failure
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve error messages for parallel parquet writer "Unable to send array to writer!"
3 participants