-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: date_trunc support timezone #6818
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -563,13 +563,12 @@ impl BuiltinScalarFunction { | |
BuiltinScalarFunction::ConcatWithSeparator => Ok(Utf8), | ||
BuiltinScalarFunction::DatePart => Ok(Float64), | ||
BuiltinScalarFunction::DateBin | BuiltinScalarFunction::DateTrunc => { | ||
match input_expr_types[1] { | ||
Timestamp(Nanosecond, _) | Utf8 | Null => { | ||
Ok(Timestamp(Nanosecond, None)) | ||
} | ||
Timestamp(Microsecond, _) => Ok(Timestamp(Microsecond, None)), | ||
Timestamp(Millisecond, _) => Ok(Timestamp(Millisecond, None)), | ||
Timestamp(Second, _) => Ok(Timestamp(Second, None)), | ||
match &input_expr_types[1] { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder if any of the calculations actually have to change based on timezone (as wouldn't the end of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it truncates before applying timezone offset, which is inconsistent with postgresql ❯ set timezone to '+08:00';
0 rows in set. Query took 0.026 seconds.
❯ select timestamptz '2000-01-01T00:00:00';
+-----------------------------+
| Utf8("2000-01-01T00:00:00") |
+-----------------------------+
| 2000-01-01T00:00:00+08:00 |
+-----------------------------+
1 row in set. Query took 0.002 seconds.
❯ select date_trunc('day', timestamptz '2000-01-01T00:00:00');
+-----------------------------------------------------+
| date_trunc(Utf8("day"),Utf8("2000-01-01T00:00:00")) |
+-----------------------------------------------------+
| 1999-12-31T00:00:00 |
+-----------------------------------------------------+
1 row in set. Query took 0.032 seconds. willy=# select timestamptz '2000-01-01T00:00:00';
timestamptz
------------------------
2000-01-01 00:00:00+08
(1 row)
willy=# select date_trunc('day', timestamptz '2000-01-01T00:00:00');
date_trunc
------------------------
2000-01-01 00:00:00+08
(1 row) |
||
Utf8 | Null => Ok(Timestamp(Nanosecond, None)), | ||
Timestamp(Nanosecond, tz) => Ok(Timestamp(Nanosecond, tz.clone())), | ||
Timestamp(Microsecond, tz) => Ok(Timestamp(Microsecond, tz.clone())), | ||
Timestamp(Millisecond, tz) => Ok(Timestamp(Millisecond, tz.clone())), | ||
Timestamp(Second, tz) => Ok(Timestamp(Second, tz.clone())), | ||
_ => Err(DataFusionError::Internal(format!( | ||
"The {self} function can only accept timestamp as the second arg." | ||
))), | ||
|
@@ -945,15 +944,37 @@ impl BuiltinScalarFunction { | |
], | ||
self.volatility(), | ||
), | ||
BuiltinScalarFunction::DateTrunc => Signature::one_of( | ||
vec![ | ||
Exact(vec![Utf8, Timestamp(Nanosecond, None)]), | ||
Exact(vec![Utf8, Timestamp(Microsecond, None)]), | ||
Exact(vec![Utf8, Timestamp(Millisecond, None)]), | ||
Exact(vec![Utf8, Timestamp(Second, None)]), | ||
], | ||
self.volatility(), | ||
), | ||
BuiltinScalarFunction::DateTrunc => { | ||
let time_zones = vec![ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe @waitingkuo has some ideas about how to deal with timezones in function signatures. It seems not quite right to have to list all timezones specially for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not sure there is a better way to provide the timezone here. And wait for @waitingkuo suggestion. Also, it should be extended to other functions using timezone. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. FWIW this issue also arises for decimal types, I suspect that the current pattern of explicitly listing the possible type signatures is going to require revisiting. We probably want to extend the pattern established in #6778 and allow functions to return a signature given a set of input types. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
"-12:00", "-11:00", "-10:00", "-09:30", "-09:00", "-08:00", "-07:00", | ||
"-06:00", "-05:00", "-04:30", "-04:00", "-03:30", "-03:00", "-02:00", | ||
"-01:00", "+00:00", "+01:00", "+02:00", "+03:00", "+03:30", "+04:00", | ||
"+04:30", "+05:00", "+05:30", "+05:45", "+06:00", "+06:30", "+07:00", | ||
"+08:00", "+08:30", "+08:45", "+09:00", "+09:30", "+10:00", "+10:30", | ||
"+11:00", "+11:30", "+12:00", "+12:45", "+13:00", "+14:00", | ||
]; | ||
let time_units = vec![ | ||
TimeUnit::Second, | ||
TimeUnit::Millisecond, | ||
TimeUnit::Microsecond, | ||
TimeUnit::Nanosecond, | ||
]; | ||
let mut signatures = vec![]; | ||
for unit in time_units { | ||
signatures.push(TypeSignature::Exact(vec![ | ||
Utf8, | ||
Timestamp(unit.clone(), None), | ||
])); | ||
for tz in &time_zones { | ||
signatures.push(TypeSignature::Exact(vec![ | ||
Utf8, | ||
Timestamp(unit.clone(), Some((*tz).into())), | ||
])); | ||
} | ||
} | ||
|
||
Signature::one_of(signatures, self.volatility()) | ||
} | ||
BuiltinScalarFunction::DateBin => { | ||
let base_sig = |array_type: TimeUnit| { | ||
vec![ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we should add a test case that truncate the day