Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: pruning by bloom filters for dictionary columns #13768

Merged
merged 1 commit into from
Dec 17, 2024

Conversation

korowa
Copy link
Contributor

@korowa korowa commented Dec 13, 2024

Which issue does this PR close?

Closes #13574 .

Rationale for this change

Currently row group pruning by Bloom filters is unaware of Dictionary typed scalar values (the literal part of predicate) while checking for values being contained in filter. This PR adds limited support for them -- now Bloom filter pruning is able to handle majority of Dictionary scalars (with the exception of Decimal type).

What changes are included in this PR?

Part of BloomFilterStatistics responsible for checking scalar value against Sbbf is moved into helper function with additional support of Dictionary() and LargeUtf8/LargeBinary type.

Are these changes tested?

There are tests for supported datatypes checking pruning and matching (to ensure that there won't be any excessive pruning) cases.

Are there any user-facing changes?

No

@github-actions github-actions bot added the core Core DataFusion crate label Dec 13, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @korowa -- this looks good to me

I had a question about binary, but I also think this PR could be merged as is

cc @adriangb

},
// Bloom filter pruning is performed only for Utf8 dictionary types since
// pruning predicate is not created for Dictionary(Numeric/Binary) types
ScalarValue::Dictionary(_, inner) => match inner.as_ref() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it isn't clear to me that it is impossible to use Dictionary(Int8, Int64)` or something to encode Int64 values, but I think this is the most common example

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, after rechecking I've found out that it's only a matter of casting literal to exact column type. I'll update filter check function and will add additional tests for more data types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we maybe don't even need to chack the inner type explicitly (it would be checked by BloomFilterStatistics::check_scalar as well). However I think this is better than what is on main today and if it is important we can add support for other types

_ => true,
}
})
.map(|value| BloomFilterStatistics::check_scalar(sbbf, value, parquet_type))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

// Bloom filter pruning is performed only for Utf8 dictionary types since
// pruning predicate is not created for Dictionary(Numeric/Binary) types
ScalarValue::Dictionary(_, inner) => match inner.as_ref() {
ScalarValue::Utf8(_) | ScalarValue::LargeUtf8(_) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you also mean to check ScalarValue::{Binary,LargeBinary} here as well?

datafusion/core/tests/parquet/mod.rs Show resolved Hide resolved
@adriangb
Copy link
Contributor

The reason for not including all already supported ParquetExecBuilder::build` is not able to produce pruning predicate for Dictionary(numeric/binary) datatypes, so these queries won't normally reach bloom filter/statistics pruning.

Is this true? So Dictionary columns are incompatible with predicate pruning based on stats as well?

@adriangb
Copy link
Contributor

Looks very nice to me thank you @korowa !

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also double checked that the newly added tests fail without the corresponding code change in this PR

failures:

---- parquet::row_group_pruning::test_bloom_filter_dict stdout ----
Planning sql SELECT * FROM t WHERE utf8 = 'h'
Input:
+------+------------+--------+--------------+
| utf8 | large_utf8 | binary | large_binary |
+------+------------+--------+--------------+
| a    | a          | 61     | 61           |
| b    | b          | 62     | 62           |
| c    | c          | 63     | 63           |
| d    | d          | 64     | 64           |
| e    | e          | 65     | 65           |
| f    | f          | 66     | 66           |
| g    | g          | 67     | 67           |
| h    | h          | 68     | 68           |
| i    | i          | 69     | 69           |
| j    | j          | 6a     | 6a           |
+------+------------+--------+--------------+
Query:
SELECT * FROM t WHERE utf8 = 'h'
Output:
+------+------------+--------+--------------+
| utf8 | large_utf8 | binary | large_binary |
+------+------------+--------+--------------+
| h    | h          | 68     | 68           |
+------+------------+--------+--------------+
Metrics:
num_predicate_creation_errors=0, time_elapsed_opening{partition=0}=13.856ms, time_elapsed_scanning_until_data{partition=0}=170.75µs, time_elapsed_scanning_total{partition=0}=222.541µs, time_elapsed_processing{partition=0}=13.786957ms, file_open_errors{partition=0}=0, file_scan_errors{partition=0}=0, start_timestamp{partition=0}=2024-12-16 15:44:59.575102 UTC, end_timestamp{partition=0}=2024-12-16 15:44:59.589198 UTC, elapsed_compute{partition=0}=NOT RECORDED, output_rows{partition=0}=5, predicate_evaluation_errors{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, row_groups_matched_bloom_filter{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=1, row_groups_pruned_bloom_filter{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, row_groups_matched_statistics{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=1, row_groups_pruned_statistics{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=1, bytes_scanned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, pushdown_rows_pruned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, pushdown_rows_matched{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, row_pushdown_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=NOT RECORDED, statistics_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=84.792µs, bloom_filter_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=13.36025ms, page_index_rows_pruned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, page_index_rows_matched{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=5, page_index_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=86.5µs, metadata_load_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=295.625µs, predicate_evaluation_errors{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, row_groups_matched_bloom_filter{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, row_groups_pruned_bloom_filter{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, row_groups_matched_statistics{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, row_groups_pruned_statistics{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, bytes_scanned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=1049141, pushdown_rows_pruned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, pushdown_rows_matched{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, row_pushdown_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=NOT RECORDED, statistics_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=NOT RECORDED, bloom_filter_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=NOT RECORDED, page_index_rows_pruned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, page_index_rows_matched{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=0, page_index_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=NOT RECORDED, metadata_load_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning6O2mvF.parquet}=NOT RECORDED
Planning sql SELECT * FROM t WHERE utf8 = 'ab'
Input:
+------+------------+--------+--------------+
| utf8 | large_utf8 | binary | large_binary |
+------+------------+--------+--------------+
| a    | a          | 61     | 61           |
| b    | b          | 62     | 62           |
| c    | c          | 63     | 63           |
| d    | d          | 64     | 64           |
| e    | e          | 65     | 65           |
| f    | f          | 66     | 66           |
| g    | g          | 67     | 67           |
| h    | h          | 68     | 68           |
| i    | i          | 69     | 69           |
| j    | j          | 6a     | 6a           |
+------+------------+--------+--------------+
Query:
SELECT * FROM t WHERE utf8 = 'ab'
Output:
++
++
Metrics:
num_predicate_creation_errors=0, time_elapsed_opening{partition=0}=13.949417ms, time_elapsed_scanning_until_data{partition=0}=186.875µs, time_elapsed_scanning_total{partition=0}=236.416µs, time_elapsed_processing{partition=0}=13.85075ms, file_open_errors{partition=0}=0, file_scan_errors{partition=0}=0, start_timestamp{partition=0}=2024-12-16 15:44:59.709470 UTC, end_timestamp{partition=0}=2024-12-16 15:44:59.723669 UTC, elapsed_compute{partition=0}=NOT RECORDED, output_rows{partition=0}=5, predicate_evaluation_errors{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, row_groups_matched_bloom_filter{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=1, row_groups_pruned_bloom_filter{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, row_groups_matched_statistics{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=1, row_groups_pruned_statistics{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=1, bytes_scanned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, pushdown_rows_pruned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, pushdown_rows_matched{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, row_pushdown_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=NOT RECORDED, statistics_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=76.041µs, bloom_filter_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=13.426708ms, page_index_rows_pruned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, page_index_rows_matched{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=5, page_index_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=84µs, metadata_load_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=333.291µs, predicate_evaluation_errors{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, row_groups_matched_bloom_filter{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, row_groups_pruned_bloom_filter{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, row_groups_matched_statistics{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, row_groups_pruned_statistics{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, bytes_scanned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=1049141, pushdown_rows_pruned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, pushdown_rows_matched{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, row_pushdown_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=NOT RECORDED, statistics_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=NOT RECORDED, bloom_filter_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=NOT RECORDED, page_index_rows_pruned{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, page_index_rows_matched{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=0, page_index_eval_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=NOT RECORDED, metadata_load_time{partition=0, filename=var/folders/1l/tg68jc6550gg8xqf1hr4mlwr0000gn/T/parquet_pruning3gshjh.parquet}=NOT RECORDED
thread 'parquet::row_group_pruning::test_bloom_filter_dict' panicked at datafusion/core/tests/parquet/row_group_pruning.rs:124:9:
assertion `left == right` failed: mismatched row_groups_matched_bloom_filter
  left: Some(1)
 right: Some(0)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    parquet::row_group_pruning::test_bloom_filter_dict

test result: FAILED. 171 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 2.40s

@korowa
Copy link
Contributor Author

korowa commented Dec 16, 2024

Is this true? So Dictionary columns are incompatible with predicate pruning based on stats as well?

No, it's not true, it was my mistake -- it works fine if literal is explicitly casted to column type.

@korowa korowa force-pushed the fix-bloom-filter-dict branch from 6a63160 to a82f94e Compare December 17, 2024 18:20
@korowa korowa changed the title fix: pruning by bloom filters for utf8 dictionary columns fix: pruning by bloom filters for dictionary columns Dec 17, 2024
@korowa
Copy link
Contributor Author

korowa commented Dec 17, 2024

I've added more tests along with types support, and updated PR description.

There are two issues I'm going to debug and create tickets or fix after getting more understanding of what's wrong there (they don't affect current implementation):

@korowa korowa force-pushed the fix-bloom-filter-dict branch from a82f94e to 5c9d66e Compare December 17, 2024 20:31
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @korowa

},
// Bloom filter pruning is performed only for Utf8 dictionary types since
// pruning predicate is not created for Dictionary(Numeric/Binary) types
ScalarValue::Dictionary(_, inner) => match inner.as_ref() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we maybe don't even need to chack the inner type explicitly (it would be checked by BloomFilterStatistics::check_scalar as well). However I think this is better than what is on main today and if it is important we can add support for other types

@alamb alamb merged commit 452a8f4 into apache:main Dec 17, 2024
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bloom filters don't work with Dictionary encoded columns
3 participants