feat: define processing time column #7209
Comments
I prefer the hidden column; it should be a property of the record, not an impure function.
My concern is that the concept of a "hidden column" is unfamiliar to users, especially Postgres users. For example, they may be confused when writing
But it does look like
OK, just allowing
How do we handle this in a batch query? Is it the time when the row is scanned into our system?
I am wondering whether it also makes sense to add the time column when the row is materialized, rather than when it is just read from the source. I suppose proc_time() is added to the row as soon as it is read.
For a table (previously a materialized source): yes, the processing time should be persisted in that table/MV as well, just like any other column.
For a source (i.e. not materialized): no, I suppose the only thing we can do is to use the read time as the processing time, as @liurenjie1024 commented above. To avoid that, users should define an event time, e.g. the Kafka timestamp.
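The table/source distinction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the engine is implemented: `Table`, `Source`, and `fake_clock` are hypothetical names, and the integer clock stands in for a real timestamp.

```python
import itertools
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


def fake_clock(start: int = 100) -> Callable[[], int]:
    """Deterministic stand-in for the system clock: start, start+1, ..."""
    counter = itertools.count(start)
    return lambda: next(counter)


class Table:
    """Materialized: proc_time is stamped once at ingestion and persisted."""

    def __init__(self, clock: Callable[[], int]) -> None:
        self._clock = clock
        self._rows: List[Row] = []

    def ingest(self, payload: str) -> None:
        # The timestamp becomes ordinary stored data, like any other column.
        self._rows.append({"payload": payload, "proc_time": self._clock()})

    def scan(self) -> List[Row]:
        # Every scan returns the same persisted proc_time values.
        return [dict(row) for row in self._rows]


class Source:
    """Not materialized: proc_time is the time at which each row is read."""

    def __init__(self, clock: Callable[[], int], payloads: Iterable[str]) -> None:
        self._clock = clock
        self._payloads = list(payloads)

    def scan(self) -> List[Row]:
        # Nothing is stored, so each scan assigns fresh read times.
        return [{"payload": p, "proc_time": self._clock()} for p in self._payloads]
```

Scanning the `Table` twice yields identical rows, while scanning the `Source` twice yields different `proc_time` values for the same payloads, which is why an event time is the safer choice for a non-materialized source.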
Is your feature request related to a problem? Please describe.
Provide a column holding the event ingestion time, in case the user data has no appropriate time column.
Describe the solution you'd like
Based on #6952, we can introduce proc_time as a column in the source, which is the same as in Flink: https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/concepts/time_attributes/#processing-time
Describe alternatives you've considered
1. Hidden column.
The problem with this approach is that it doesn't explicitly tell users that the column is associated with a source/table. For example:
This is clear:
This is ambiguous:
2. System function
This is actually a different thing. A function is evaluated when the query executes, whereas a column is data injected with the row.
It doesn't matter for this simple query:
But will be a problem for more complex ones:
We had better call it now() or something else to distinguish it from our topic here.
Additional context
No response
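To make the column-versus-function distinction from alternative 2 concrete, here is a small Python sketch. It is a hypothetical illustration, not the queries from the issue: `ingest`, `select_with_column`, and `select_with_function` are invented names, timestamps are plain integers, and `now` is assumed to be evaluated once per query execution.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def ingest(payloads: Iterable[str], arrival_times: Iterable[int]) -> List[Row]:
    """Attach proc_time to each row at the moment it enters the system."""
    return [{"payload": p, "proc_time": t} for p, t in zip(payloads, arrival_times)]


def select_with_column(rows: List[Row]) -> List[Tuple[object, object]]:
    """proc_time is data: each row carries the value it was stamped with."""
    return [(row["payload"], row["proc_time"]) for row in rows]


def select_with_function(rows: List[Row], now: int) -> List[Tuple[object, int]]:
    """A now()-style function is evaluated at execution, not at ingestion."""
    return [(row["payload"], now) for row in rows]
```

With rows ingested at times 10 and 20 and a query executed at time 30, the column version reports 10 and 20 while the function version reports 30 for both rows, so the two are not interchangeable once the query runs later than ingestion.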