Description
Currently, the `finalizeInternal` function of the `HashAggregate` operator is performed in a single-threaded manner. For aggregations performed on large tables, the finalize becomes a significant bottleneck.
I'm running benchmarks on the MS MARCO dataset for FTS where we do aggregation for creating the index: https://trec-rag.github.io/annoucements/2024-corpus-finalization/
On a small segment partition (#00), the following query:
takes 134322.01 ms to run, and the finalize part alone takes 84198 ms (about 63% of the total).