Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse real number literals as the Decimal type #12817

Open
doupache opened this issue Oct 8, 2024 · 4 comments
Open

Parse real number literals as the Decimal type #12817

doupache opened this issue Oct 8, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@doupache
Copy link
Contributor

doupache commented Oct 8, 2024

Is your feature request related to a problem or challenge?

During the fix for 12655, I found that the root cause was that we parse real number literals as f64. When performing coercion between f32 and f64, it can lead to significant precision problems.

As a solution, I proposed changing the default behavior to parse real number literals as Decimal. This not only helps avoid the precision issues between f32 and f64, but also aligns the default behavior with Postgres and DuckDB.

Postgres

SELECT 1.3 as real , pg_typeof(1.3) as type;  
real type
1.3 numeric

DuckDB

SELECT 1.3 as real , typeof(1.3) as type;  
real type
1.3 DECIMAL(2,1)

DataFusion

SELECT 1.3 as real , arrow_typeof(1.3) as type;  
real type
1.3 Float64

Fortunately, we have excellent support for Decimal, so the only change we need to make is setting the default value of sql_parser.parse_float_as_decimal to true. However, I believe this change could be breaking, as it alters the current behavior.

Therefore, it would be important to get broader community consensus before proceeding.

cc @alamb @jayzhan211 @jonahgao

Describe the solution you'd like

change sql_parser.parse_float_as_decimal to true

Describe alternatives you've considered

No response

Additional context

No response

@doupache doupache added the enhancement New feature or request label Oct 8, 2024
@andygrove
Copy link
Member

+1 for this. The current behavior is not consistent with ANSI SQL.

@jonahgao
Copy link
Member

jonahgao commented Oct 9, 2024

Makse sense to me. We should support scientific notation before this.

DataFusion CLI v42.0.0
> set datafusion.sql_parser.parse_float_as_decimal=true;
0 row(s) fetched.
Elapsed 0.001 seconds.

> select 1.1e10;
SQL error: ParserError("Cannot parse 11e10 as i128 when building decimal: invalid digit found in string")

Instead of parsing manually, I think we can use libraries like BigDecimal, or utils from arrow-rs.

@austin362667
Copy link
Contributor

+1 for this. Makes sense to me, and so is supporting scientific notation.

@jayzhan211
Copy link
Contributor

I think the challenge is that we need to make sure decimal is supported, so when we switch from float to decimal, every functions works as expected as well. Existing test might not covered all the types

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants