Add additional regexp functions #11946
Comments
I could work on this. The only concern is whether we implement the regexp function in this project or in Hey @alamb, would you prefer implement function in |
Thanks @xinlifoobar. I would personally recommend we start by implementing them in datafusion, as that avoids the need to wait for coordinated releases of arrow-rs, and then port them back upstream to arrow-rs as a follow-on step. |
@xinlifoobar I suspect there will be several other contributors interested in helping out and learning during the process. If we have a good example to follow, I think the work would be straightforward to scale. One way to do this might be:
|
Related to this, The currently accepted argument types are:
Postgres's regex substring takes a string, a pattern, and an escape character, so I don't think there would be a conflict. |
Spark's version of this is https://spark.apache.org/docs/latest/api/sql/#regexp_substr |
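For anyone unfamiliar with the Spark function linked above, here is a rough sketch of the behavior as I understand it: return the first substring that matches the pattern, or null when nothing matches. It is written in Python with the standard `re` module purely for illustration. The null handling for null inputs is my assumption, and Python's regex dialect differs from both Java's and the Rust `regex` crate, so treat this as a semantic sketch rather than a proposed implementation.

```python
import re
from typing import Optional


def regexp_substr(text: Optional[str], pattern: Optional[str]) -> Optional[str]:
    """Illustrative stand-in for a SQL regexp_substr(str, regexp).

    Returns the first substring of `text` that matches `pattern`,
    or None (SQL NULL) when there is no match.
    """
    # Assumption: a NULL input yields NULL, as is typical for SQL scalar functions.
    if text is None or pattern is None:
        return None
    match = re.search(pattern, text)
    # group(0) is the whole matched substring, regardless of capture groups.
    return match.group(0) if match else None
```

Note that this two-argument form has no escape-character parameter, which is the difference from the Postgres variant discussed above.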
Based on prior conversations, it sounds like the group is most interested in making sure we support PostgreSQL, so I think adding this is a very good idea. We can also have |
It is a good idea to support these functions, especially for |
Adding |
I took a look at the syntax for that pg function, and frankly it's awful. Personally I think that is a function best ignored. |
I see your point about the Postgres function: since it essentially does either substring extraction by character index or regex matching, I think supporting it exactly the way they do would probably add more confusion than value. However, I'm far from a SQL expert. |
Is your feature request related to a problem or challenge?
I would like to see the following regexp functions implemented. These exist in some, but not all, versions of PostgreSQL.
Describe the solution you'd like
Implement these functions.
Describe alternatives you've considered
These operations can be performed using the existing functions, so I am currently unblocked for my immediate use case, but having these functions built in would be convenient.
Additional context
We currently have the following regexp functions implemented. The source is in
datafusion/functions/src/regex/mod.rs
regexp_like()
regexp_match()
regexp_replace()
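As a rough orientation for readers new to these three functions, the sketch below mimics their general behavior in Python with the standard `re` module. This is not the DataFusion implementation. The helper bodies, the shape of the `regexp_match` result, and the handling of the `g` flag are my assumptions based on the Postgres-style semantics, and Python's regex and replacement syntax differ from the Rust `regex` crate.

```python
import re
from typing import List, Optional


def regexp_like(text: str, pattern: str) -> bool:
    # True if the pattern matches anywhere in the string.
    return re.search(pattern, text) is not None


def regexp_match(text: str, pattern: str) -> Optional[List[Optional[str]]]:
    # Assumed Postgres-style result: the capture groups of the first match,
    # or the whole match when the pattern has no groups; None if no match.
    match = re.search(pattern, text)
    if match is None:
        return None
    if match.re.groups == 0:
        return [match.group(0)]
    return list(match.groups())


def regexp_replace(text: str, pattern: str, replacement: str, flags: str = "") -> str:
    # Assumed Postgres-style behavior: replace only the first match
    # unless the 'g' (global) flag is given. count=0 means "replace all".
    count = 0 if "g" in flags else 1
    return re.sub(pattern, replacement, text, count=count)
```

For example, `regexp_replace("aaa", "a", "b")` changes only the first character in this sketch, while passing `"g"` as the flags argument replaces every match.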