jon-chuang changed the title on Apr 9, 2023
from: bug(expr): Substr of unicode: "byte index _ is not a char boundary"
to: bug(expr): SUBSTR of unicode produces error: "byte index _ is not a char boundary"
SUBSTR should compute start positions and lengths in Unicode characters, not in bytes.
To reproduce:
Risingwave:
PSQL:
The same applies to 'Mér'::char(3).
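For illustration, here is a minimal sketch of character-based slicing in Rust. It is not the actual RisingWave implementation: `substr_chars` is a hypothetical helper, and it assumes a 1-based `start` of at least 1 and a non-negative length, so it skips the PostgreSQL edge cases for zero or negative positions. It walks `char_indices()` to find byte offsets that always fall on character boundaries, which is what avoids the "byte index _ is not a char boundary" panic.

```rust
/// Hypothetical helper (not RisingWave's API): returns up to `len` characters
/// of `s`, starting at the 1-based character position `start`.
/// Assumes `start >= 1`; SQL semantics for smaller positions are not modeled.
fn substr_chars(s: &str, start: usize, len: usize) -> &str {
    // Convert the 1-based character position into a 0-based character count to skip.
    let skip = start.saturating_sub(1);

    // Byte offset at which the first wanted character begins,
    // or the end of the string if `start` is past the last character.
    let begin = s.char_indices().nth(skip).map_or(s.len(), |(i, _)| i);
    let rest = &s[begin..];

    // Byte offset just past `len` characters of the remainder,
    // or the end of the remainder if fewer than `len` characters are left.
    let end = rest.char_indices().nth(len).map_or(rest.len(), |(i, _)| i);
    &rest[..end]
}

/// Hypothetical helper: keeps at most `n` characters, in the spirit of a
/// character-counted `char(n)` cast (padding is not modeled).
fn truncate_chars(s: &str, n: usize) -> &str {
    substr_chars(s, 1, n)
}
```

In "Mér", the character é occupies bytes 1 and 2, so byte index 2 is not a character boundary and slicing with `&s[0..2]` panics. `substr_chars("Mér", 1, 2)` instead yields "Mé", and `truncate_chars("Mér", 3)` keeps all three characters even though the string is four bytes long.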
resources:
src/expr/src/vector_op/substr.rs:48:23
risingwave/src/expr/src/vector_op/substr.rs, line 37 in 95ab15c
https://stackoverflow.com/questions/4249745/does-postgresql-varchar-count-using-unicode-character-length-or-ascii-character