Is your feature request related to a problem or challenge?
Currently, RepartitionExec is implemented with a custom MPSC channel built on parking_lot. This implementation performs poorly and can become a bottleneck in some queries when the number of input/output partitions is large.
Describe the solution you'd like
We could use a lock-free MPSC channel, such as flume, to improve performance.
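To make the channel layout concrete, here is a minimal sketch of hash repartitioning with one MPSC channel per output partition. It is not DataFusion's code: it uses `std::sync::mpsc` from the standard library as a stand-in for the channel (so it says nothing about flume's performance), plain `u64` values stand in for record batches, and the names `partition_for` and `repartition` are hypothetical. It assumes `num_outputs` is greater than zero.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: choose an output partition by hashing the key.
fn partition_for(key: u64, num_outputs: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % num_outputs as u64) as usize
}

/// Hypothetical sketch: route rows from N input partitions to
/// `num_outputs` output partitions (assumed > 0).
fn repartition(inputs: Vec<Vec<u64>>, num_outputs: usize) -> Vec<Vec<u64>> {
    // One MPSC channel per output partition.
    let (senders, receivers): (Vec<_>, Vec<_>) =
        (0..num_outputs).map(|_| mpsc::channel::<u64>()).unzip();

    // One producer thread per input partition; each owns a clone of every sender,
    // so every output channel has multiple producers and a single consumer.
    let producers: Vec<_> = inputs
        .into_iter()
        .map(|rows| {
            let senders = senders.clone();
            thread::spawn(move || {
                for row in rows {
                    let target = partition_for(row, senders.len());
                    senders[target].send(row).expect("receiver dropped");
                }
            })
        })
        .collect();

    // Drop the original senders so receivers see end-of-stream
    // once all producer threads have finished.
    drop(senders);

    // One consumer thread per output partition drains its channel.
    let consumers: Vec<_> = receivers
        .into_iter()
        .map(|rx| thread::spawn(move || rx.iter().collect::<Vec<u64>>()))
        .collect();

    for producer in producers {
        producer.join().expect("producer panicked");
    }
    consumers
        .into_iter()
        .map(|consumer| consumer.join().expect("consumer panicked"))
        .collect()
}
```

With N inputs and M outputs, every row passes through one of M channels, each shared by N senders, so per-message channel overhead and sender contention grow with the partition counts. That is the cost the proposal aims to reduce.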
Describe alternatives you've considered
No response
Additional context
I have implemented this idea, and the TPC-H benchmark shows that it speeds up many of the queries:
Comparing main and feature_flume
Benchmark tpch.json
| Query | main | feature_flume | Change |
|---|---|---|---|
| Query 1 | 317.52ms | 317.86ms | no change |
| Query 2 | 73.18ms | 70.41ms | no change |
| Query 3 | 136.38ms | 113.01ms | +1.21x faster |
| Query 4 | 84.27ms | 51.30ms | +1.64x faster |
| Query 5 | 170.56ms | 123.28ms | +1.38x faster |
| Query 6 | 83.52ms | 81.93ms | no change |
| Query 7 | 249.60ms | 220.84ms | +1.13x faster |
| Query 8 | 191.66ms | 175.73ms | +1.09x faster |
| Query 9 | 282.38ms | 213.37ms | +1.32x faster |
| Query 10 | 230.92ms | 153.20ms | +1.51x faster |
| Query 11 | 52.68ms | 54.10ms | no change |
| Query 12 | 153.50ms | 119.72ms | +1.28x faster |
| Query 13 | 314.86ms | 313.01ms | no change |
| Query 14 | 115.02ms | 115.82ms | no change |
| Query 15 | 90.32ms | 89.26ms | no change |
| Query 16 | 67.44ms | 61.57ms | +1.10x faster |
| Query 17 | 785.40ms | 786.18ms | no change |
| Query 18 | 636.27ms | 491.24ms | +1.30x faster |
| Query 19 | 232.26ms | 231.82ms | no change |
| Query 20 | 261.95ms | 240.57ms | +1.09x faster |
| Query 21 | 351.81ms | 239.96ms | +1.47x faster |
| Query 22 | 54.88ms | 49.39ms | +1.11x faster |
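For readers checking the numbers, the Change column is consistent with dividing the main time by the feature_flume time and treating small differences as noise. The sketch below reproduces that arithmetic. The function name `describe_change` and the 5% noise threshold are assumptions chosen to match the reported values, not something stated in this issue.

```rust
/// Hypothetical helper: classify a timing change the way the Change column reads.
/// `threshold` is an assumed noise band (e.g. 0.05 for 5%).
fn describe_change(baseline_ms: f64, candidate_ms: f64, threshold: f64) -> String {
    let speedup = baseline_ms / candidate_ms;
    let slowdown = candidate_ms / baseline_ms;
    if speedup > 1.0 + threshold {
        format!("+{:.2}x faster", speedup)
    } else if slowdown > 1.0 + threshold {
        format!("{:.2}x slower", slowdown)
    } else {
        // Differences inside the noise band are reported as unchanged.
        "no change".to_string()
    }
}
```

For example, query 4 gives 84.27 / 51.30 ≈ 1.64, while query 2 gives 73.18 / 70.41 ≈ 1.04, which falls inside the assumed noise band.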