Incorrect results returned for TPC-H Query 8 #6794
Comments
I will take a look.
Might be related to
That is because an overflow happened in the decimal divide kernel; see the previous comment at https://github.com/apache/arrow-datafusion/pull/5675/files#r1152896889. I should provide a scalar version of the fixed-point decimal multiplication kernel to fix it, but I haven't found time to work on it.
Currently you can only get a meaningful result by adding a cast on the division, see
I'm going to add a scalar version of the fixed-point decimal multiplication kernel upstream. We can use it to fix this once it is available.
I think this is actually a bug in the way that decimal type coercion is currently performed, which causes computations to overflow when they shouldn't: #6828. This is something I hope to fix upstream as part of apache/arrow-rs#3999. Edit: To clarify, this can't easily be fixed without apache/arrow-rs#1047, as there is currently no way to propagate the precision and scale of a scalar.
That is why the internal scaling in the division kernel should be a fixed-point computation, which allows precision loss instead of overflow.
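To illustrate the overflow-versus-precision-loss trade-off discussed above, here is a minimal sketch in plain Rust. It is not the arrow-rs or DataFusion kernel; the function names `div_scaled_checked` and `div_scaled_lossy` are hypothetical, and decimals are modelled as raw `i128` values whose numerator must be multiplied by `10^k` before an integer division to reach the result scale.

```rust
/// Naive approach (hypothetical helper): scale the numerator by 10^k,
/// then divide. Returns None if the intermediate product overflows i128,
/// even when the final quotient would have fit.
fn div_scaled_checked(num: i128, den: i128, k: u32) -> Option<i128> {
    let factor = 10i128.checked_pow(k)?;
    num.checked_mul(factor)?.checked_div(den)
}

/// Lossy approach (hypothetical helper): apply as much of the 10^k scaling
/// as fits before the division and the remainder afterwards. The low-order
/// digits of the result are lost, but the intermediate does not overflow.
fn div_scaled_lossy(num: i128, den: i128, k: u32) -> Option<i128> {
    let mut pre = k;
    loop {
        // Try to scale the numerator by 10^pre without overflowing.
        let scaled = 10i128.checked_pow(pre).and_then(|f| num.checked_mul(f));
        if let Some(scaled) = scaled {
            let quotient = scaled.checked_div(den)?;
            // Apply the rest of the scaling after dividing. This can still
            // return None if the true result itself does not fit in i128.
            return 10i128
                .checked_pow(k - pre)
                .and_then(|f| quotient.checked_mul(f));
        }
        if pre == 0 {
            return None; // defensive: 10^0 * num always fits, so unreachable
        }
        pre -= 1;
    }
}
```

With `num = 10^30`, `den = 3_000_000` and `k = 10`, the naive version returns `None` because `10^40` does not fit in an `i128`, while the lossy version returns a value whose last two digits are zero instead of the exact ones.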
Can someone test this again now that #6832 has been merged?
Opened #7233 to verify it.
Sure. Just ran Query 8. It's working well now with the latest build on the main branch. Nice speed boost as well.
The
🎉
Describe the bug
DataFusion returns incorrect results when running TPC-H Query 8 against Parquet files.
To Reproduce
Expected behavior
The correct results should be:
Additional context
No response