Relax combine partial final rule #10913

mustafasrepo · 2024-06-14T13:44:22Z

Which issue does this PR close?

Closes #.

Rationale for this change

Currently, the CombinePartialFinalAggregate rule combines AggregateExec::Partial + AggregateExec::Final into AggregateExec::Single when the following conditions are met (the same conditions apply when combining AggregateExec::Partial + AggregateExec::FinalPartitioned into AggregateExec::SinglePartitioned):

These operators are consecutive
Their group by expressions are equal
Their aggregate expressions are equal
Their filter expressions are equal.

See the can_combine function for the implementation.
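
The four conditions can be read as one strict equality test. The sketch below is a minimal toy model, not DataFusion's code: `Expr`, `Aggregate`, `can_combine_strict`, and `example_pair` are hypothetical names, and aggregate and filter expressions are modeled as plain strings for brevity. It also builds a pair in which the Partial stage groups on a scalar function while the Final stage groups on the resulting column, which the strict test rejects.

```rust
/// Toy physical expression (hypothetical, not DataFusion's `PhysicalExpr`).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    /// Column reference such as `ts@0`.
    Column { name: String, index: usize },
    /// Scalar function call such as `date_bin(ts@0)`.
    Func { name: String, args: Vec<Expr> },
}

/// Toy aggregate operator; group-by entries are `(expression, output alias)`.
#[derive(Debug, Clone, PartialEq)]
pub struct Aggregate {
    pub group_by: Vec<(Expr, String)>,
    pub aggr_exprs: Vec<String>,
    pub filters: Vec<Option<String>>,
}

/// Strict rule: group-by, aggregate, and filter expressions must all be equal.
/// (Adjacency of the two operators is assumed to be checked by the caller.)
pub fn can_combine_strict(final_agg: &Aggregate, partial_agg: &Aggregate) -> bool {
    final_agg.group_by == partial_agg.group_by
        && final_agg.aggr_exprs == partial_agg.aggr_exprs
        && final_agg.filters == partial_agg.filters
}

/// Returns `(final, partial)` where Partial groups on a function of `ts`
/// and Final groups on the column that Partial produced.
pub fn example_pair() -> (Aggregate, Aggregate) {
    let ts = Expr::Column { name: "ts".to_string(), index: 0 };
    let date_bin = Expr::Func { name: "date_bin".to_string(), args: vec![ts] };
    let partial = Aggregate {
        group_by: vec![(date_bin, "ts_chunk".to_string())],
        aggr_exprs: vec!["COUNT(keyword)".to_string()],
        filters: vec![None],
    };
    let final_agg = Aggregate {
        group_by: vec![(
            Expr::Column { name: "ts_chunk".to_string(), index: 0 },
            "ts_chunk".to_string(),
        )],
        // Aggregate and filter expressions are copied unchanged.
        ..partial.clone()
    };
    (final_agg, partial)
}
```

Calling `can_combine_strict` on the pair from `example_pair` returns `false`, because the function expression and the column that holds its result are compared structurally.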

However, the query below

SELECT
    DATE_BIN(INTERVAL '2' MINUTE, ts, '2000-01-01') AS ts_chunk,
    ARRAY_AGG(DISTINCT keyword) AS keywords,
    COUNT(keyword) AS alert_keyword_count
FROM
    keywords_stream
WHERE
    keywords_stream.keyword IN (SELECT keyword FROM ALERT_KEYWORDS)
GROUP BY
    ts_chunk;

where keywords_stream is defined as

CREATE TABLE keywords_stream (
    ts TIMESTAMP,
    sn INTEGER PRIMARY KEY,
    keyword VARCHAR NOT NULL
);

and ALERT_KEYWORDS is defined as

CREATE TABLE ALERT_KEYWORDS(keyword VARCHAR NOT NULL);

generates the following plan

01)ProjectionExec: expr=[date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 120000000000 }"),keywords_stream.ts,Utf8("2000-01-01"))@0 as ts_chunk, COUNT(keywords_stream.keyword)@1 as alert_keyword_count]
02)--AggregateExec: mode=Final, gby=[date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 120000000000 }"),keywords_stream.ts,Utf8("2000-01-01"))@0 as date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 120000000000 }"),keywords_stream.ts,Utf8("2000-01-01"))], aggr=[COUNT(keywords_stream.keyword)]
03)----AggregateExec: mode=Partial, gby=[date_bin(IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 120000000000 }, ts@0, 946684800000000000) as date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 120000000000 }"),keywords_stream.ts,Utf8("2000-01-01"))], aggr=[COUNT(keywords_stream.keyword)]
04)------CoalesceBatchesExec: target_batch_size=2
05)--------HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(keyword@0, keyword@1)]
06)----------MemoryExec: partitions=1, partition_sizes=[1]
07)----------MemoryExec: partitions=1, partition_sizes=[1]

where AggregateExec::Partial and AggregateExec::Final couldn't be combined into AggregateExec::Single. However, we should be able to generate the following plan

01)ProjectionExec: expr=[date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 120000000000 }"),keywords_stream.ts,Utf8("2000-01-01"))@0 as ts_chunk, COUNT(keywords_stream.keyword)@1 as alert_keyword_count]
02)--AggregateExec: mode=Single, gby=[date_bin(IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 120000000000 }, ts@0, 946684800000000000) as date_bin(IntervalMonthDayNano("IntervalMonthDayNano { months: 0, days: 0, nanoseconds: 120000000000 }"),keywords_stream.ts,Utf8("2000-01-01"))], aggr=[COUNT(keywords_stream.keyword)]
03)----CoalesceBatchesExec: target_batch_size=2
04)------HashJoinExec: mode=CollectLeft, join_type=RightSemi, on=[(keyword@0, keyword@1)]
05)--------MemoryExec: partitions=1, partition_sizes=[1]
06)--------MemoryExec: partitions=1, partition_sizes=[1]

with an AggregateExec: mode=Single operator. The reason we currently cannot make this change is that the group by expressions of AggregateExec: mode=Partial and AggregateExec: mode=Final are not the same (Partial has a scalar function, Final has a Column expression that is the result of the scalar function).
However, as far as I can tell, as long as AggregateExec::Partial and AggregateExec::Final are consecutive, we can combine these operators into AggregateExec::Single (these operators are guaranteed to be related, and their partitioning is the same). Hence, can_combine should be more like an invariant check. Looking at the planner, by invariant the aggregate expressions and filter expressions should be exactly the same. However, the group by expressions can differ (the partial group by might be a complex expression, and the final group by will be its result in column form). Hence, I propose to relax this group by equality check to generate better plans for single-partition plans.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

ozankabak

LGTM, but let's wait for additional community review in case we are missing something.

alamb · 2024-06-15T17:36:42Z

datafusion/sqllogictest/test_files/joins.slt

-14)------------------RepartitionExec: partitioning=Hash([t2_id@0], 2), input_partitions=2
-15)--------------------RepartitionExec: partitioning=RoundRobinBatch(2), input_partitions=1
-16)----------------------MemoryExec: partitions=1, partition_sizes=[1]
+05)--------AggregateExec: mode=SinglePartitioned, gby=[t1_id@0 as alias1], aggr=[]


I agree this plan looks better and correct

alamb · 2024-06-15T17:40:49Z

datafusion/core/src/physical_optimizer/combine_partial_final_agg.rs

@@ -144,8 +144,12 @@ fn can_combine(final_agg: GroupExprsRef, partial_agg: GroupExprsRef) -> bool {
    let (input_group_by, input_aggr_expr, input_filter_expr) =
        normalize_group_exprs(partial_agg);

-    final_group_by.eq(&input_group_by)


I am not sure that just checking the length of the group bys is sufficient -- I think logically they must be the same.

It seems like the reason these weren't combined

05)--------AggregateExec: mode=FinalPartitioned, gby=[alias1@0 as alias1], aggr=[]
06)----------AggregateExec: mode=Partial, gby=[t1_id@0 as alias1], aggr=[]

is that, because of aliasing, the exprs didn't match exactly -- t1_id@0 as alias1 didn't match alias1@0 as alias1, even though I think they are logically equivalent.

So for example, if we ever made the following plan (with actually different grouping expressions), after this change the code would incorrectly collapse them

05)--------AggregateExec: mode=FinalPartitioned, gby=[alias1 / 2 as alias1], aggr=[]
06)----------AggregateExec: mode=Partial, gby=[t1_id@0 as alias1], aggr=[]

However, I am not sure that such a plan would be valid 🤔
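
To make the concern concrete, here is a toy sketch that assumes the relaxed check compares only the number of group by expressions, as the comment above describes. `Expr`, `GroupBy`, `same_len_only`, and `col` are hypothetical names for illustration and are not DataFusion APIs.

```rust
/// Toy physical expression (hypothetical, not DataFusion's `PhysicalExpr`).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    /// Column reference such as `t1_id@0`.
    Column { name: String, index: usize },
    /// Division by an integer literal, e.g. `alias1 / 2`.
    DivideBy { input: Box<Expr>, divisor: i64 },
}

/// Group-by entries are `(expression, output alias)`.
pub type GroupBy = Vec<(Expr, String)>;

/// Assumed relaxed check: only the number of group-by expressions is compared.
pub fn same_len_only(final_gby: &GroupBy, partial_gby: &GroupBy) -> bool {
    final_gby.len() == partial_gby.len()
}

/// Convenience constructor for a column reference.
pub fn col(name: &str, index: usize) -> Expr {
    Expr::Column { name: name.to_string(), index }
}

/// Partial stage: `gby=[t1_id@0 as alias1]`.
pub fn partial_gby() -> GroupBy {
    vec![(col("t1_id", 0), "alias1".to_string())]
}

/// Final stage that only re-reads the partial key: `gby=[alias1@0 as alias1]`.
pub fn final_same_key() -> GroupBy {
    vec![(col("alias1", 0), "alias1".to_string())]
}

/// Final stage with a genuinely different key: `gby=[alias1 / 2 as alias1]`.
pub fn final_different_key() -> GroupBy {
    let halved = Expr::DivideBy { input: Box::new(col("alias1", 0)), divisor: 2 };
    vec![(halved, "alias1".to_string())]
}
```

Under this assumption, `same_len_only` accepts both `final_same_key()` and `final_different_key()` against `partial_gby()`, so a length comparison alone cannot tell the equivalent pair from the mismatched one.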

ozankabak · 2024-06-15

We were trying to think whether it is possible for a valid plan to have a consecutive Partial/Final duo with differing GROUP BY expressions (unless of course it is manually generated that way without a query).

We weren't able to find an example of this and started to think it is not possible. That's why that check was deemed to inhibit better plans in some cases without adding any real protection.

We would appreciate some brain cycles from the community on this. If our suspicion is correct, this small PR will give us better plans in many cases.

alamb · 2024-06-17

Thank you for the explanation.

My concern is that if someone ever did create a plan that didn't have the same grouping expressions, this condition could still apply and thus cause a very hard-to-debug failure.

I think we should at least add some comments to the check explaining the assumption (that a two phase grouping must have semantically the same grouping keys) to help future readers / developers. Then I think this PR is ok to merge.

I also do wonder if we have some pre-existing code to check two expressions for equality from different schemas by normalizing them or something, but I didn't try and check for it at the moment

mustafasrepo · 2024-06-20

After some thinking, I realized that, since we are checking expression equality between consecutive operators, the output group by expressions of the first operator and the input group by expressions of the second operator must be equal. I re-introduced the group by equality condition using this comparison. With this comparison, we still generate better plans without relaxing the check. It can be found in the commit
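
The comparison described here can be sketched in a toy model: derive the columns the Partial stage emits for its group keys and compare them with the expressions the Final stage evaluates on its input. This is an illustration under simplifying assumptions, not the merged implementation; `Expr`, `GroupBy`, `output_exprs`, `input_exprs`, and `group_by_compatible` are hypothetical helpers defined only for this example.

```rust
/// Toy physical expression (hypothetical, not DataFusion's `PhysicalExpr`).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    /// Column reference such as `ts@0`.
    Column { name: String, index: usize },
    /// Scalar function call such as `date_bin(ts@0)`.
    Func { name: String, args: Vec<Expr> },
}

/// Group-by entries are `(expression, output alias)`.
pub type GroupBy = Vec<(Expr, String)>;

/// Columns a stage emits for its group keys: one per alias, in order.
pub fn output_exprs(gby: &GroupBy) -> Vec<Expr> {
    gby.iter()
        .enumerate()
        .map(|(index, (_, alias))| Expr::Column { name: alias.clone(), index })
        .collect()
}

/// Expressions a stage evaluates against its input to form group keys.
pub fn input_exprs(gby: &GroupBy) -> Vec<Expr> {
    gby.iter().map(|(expr, _)| expr.clone()).collect()
}

/// The Final stage must group on exactly the columns the Partial stage emits.
pub fn group_by_compatible(final_gby: &GroupBy, partial_gby: &GroupBy) -> bool {
    output_exprs(partial_gby) == input_exprs(final_gby)
}

/// Partial stage grouping on a scalar function: `date_bin(ts@0) as ts_chunk`.
pub fn partial_example() -> GroupBy {
    let ts = Expr::Column { name: "ts".to_string(), index: 0 };
    let date_bin = Expr::Func { name: "date_bin".to_string(), args: vec![ts] };
    vec![(date_bin, "ts_chunk".to_string())]
}

/// Final stage grouping on the emitted column: `ts_chunk@0 as ts_chunk`.
pub fn final_example() -> GroupBy {
    let key = Expr::Column { name: "ts_chunk".to_string(), index: 0 };
    vec![(key, "ts_chunk".to_string())]
}
```

In this model the scalar-function case is accepted, because the Final stage reads `ts_chunk@0`, which is exactly what the Partial stage emits. A Final stage that applied a further function to that column would be rejected, since its input expression would no longer be the plain column.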

alamb

Looks good to me -- thank you @mustafasrepo

alamb · 2024-06-21T16:23:40Z

datafusion/core/src/physical_optimizer/combine_partial_final_agg.rs

+    let (input_group_by, input_aggr_expr, input_filter_expr) = partial_agg;
+
+    // Compare output expressions of the partial, and input expressions of the final operator.
+    physical_exprs_equal(


alamb · 2024-06-21T16:25:38Z

I also merged this branch with main and re-ran the sqllogictests and verified they all still passed

* Minor changes
* Minor changes
* Re-introduce group by expression check

Minor changes

3ffb64f

github-actions bot added the core (Core DataFusion crate) and sqllogictest (SQL Logic Tests (.slt)) labels Jun 14, 2024

Minor changes

7e2ac25

ozankabak approved these changes Jun 15, 2024

alamb reviewed Jun 15, 2024

Re-introduce group by expression check

217c266

alamb approved these changes Jun 21, 2024

alamb merged commit 098ba30 into apache:main Jun 21, 2024
23 checks passed

xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jun 22, 2024

Relax combine partial final rule (apache#10913)

832e58b

* Minor changes
* Minor changes
* Re-introduce group by expression check

xinlifoobar pushed a commit to xinlifoobar/datafusion that referenced this pull request Jun 22, 2024

Relax combine partial final rule (apache#10913)

b0d0f6c

* Minor changes
* Minor changes
* Re-introduce group by expression check

findepi pushed a commit to findepi/datafusion that referenced this pull request Jul 16, 2024

Relax combine partial final rule (apache#10913)

bd1a197

* Minor changes
* Minor changes
* Re-introduce group by expression check
