Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support unicode character for initcap function #13752

Merged
merged 6 commits into from
Dec 22, 2024

Conversation

tlm365
Copy link
Contributor

@tlm365 tlm365 commented Dec 13, 2024

Which issue does this PR close?

Closes #13711.

Rationale for this change

What changes are included in this PR?

  • Support UTF-8 for initcap function.
  • Moved initcap from string module to unicode module.
  • Update docs.

Are these changes tested?

Yes, unit tests are included.

Are there any user-facing changes?

Yes. api change.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Dec 13, 2024
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Dec 13, 2024
@tlm365 tlm365 marked this pull request as ready for review December 14, 2024 00:26
Copy link
Contributor

@2010YOUY01 2010YOUY01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this looks good to me

@@ -74,7 +76,7 @@ impl ScalarUDFImpl for InitcapFunc {
DataType::LargeUtf8 => make_scalar_function(initcap::<i64>, vec![])(args),
DataType::Utf8View => make_scalar_function(initcap_utf8view, vec![])(args),
other => {
exec_err!("Unsupported data type {other:?} for function initcap")
exec_err!("Unsupported data type {other:?} for function `initcap`")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
exec_err!("Unsupported data type {other:?} for function `initcap`")
internal_err!("Unsupported data type {other:?} for function `initcap`")

Incompatible arg should be checked during planning before, thus here is unreachable, and we can use internal_err to indicate a potential bug if it's executed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@2010YOUY01 Thanks for reviewing,

Incompatible arg should be checked during planning before, thus here is unreachable, and we can use internal_err to indicate a potential bug if it's executed

Make sense to me👍

@@ -460,7 +460,7 @@ Andrew Datafusion📊🔥
Xiangpeng Datafusion数据融合
Raphael Datafusionдатафусион
Under_Score Un Iść Core
Percent Pan Tadeusz Ma Iść W KąT
Percent Pan Tadeusz Ma Iść W Kąt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've checked and this result is compatible with PostgreSQL

Copy link
Contributor

@goldmedal goldmedal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tlm365 It looks good to me but some minor comments.

@@ -213,7 +243,7 @@ mod tests {
Ok(Some("Hi Thomas")),
&str,
Utf8,
StringArray
StringViewArray
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

datafusion/functions/src/unicode/initcap.rs Outdated Show resolved Hide resolved
datafusion/functions/src/unicode/initcap.rs Outdated Show resolved Hide resolved
@tlm365 tlm365 marked this pull request as draft December 17, 2024 13:47
Signed-off-by: Tai Le Manh <[email protected]>
@tlm365 tlm365 marked this pull request as ready for review December 22, 2024 07:36
@goldmedal goldmedal merged commit a267784 into apache:main Dec 22, 2024
27 checks passed
@goldmedal
Copy link
Contributor

Thanks @tlm365 and @2010YOUY01 for reviewing

zhuqi-lucas pushed a commit to zhuqi-lucas/arrow-datafusion that referenced this pull request Dec 23, 2024
* Support unicode character for 'initcap' function

Signed-off-by: Tai Le Manh <[email protected]>

* Update unit tests

* Fix clippy warning

* Update sqllogictests - initcap

* Update scalar_functions.md docs

* Add suggestions change

Signed-off-by: Tai Le Manh <[email protected]>

---------

Signed-off-by: Tai Le Manh <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation functions sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support unicode character for initcap function
3 participants