Internal error in regexp_replace() for some StringView input (SQLancer) #12150

Closed
2010YOUY01 opened this issue Aug 24, 2024 · 3 comments · Fixed by #12203
Labels: bug (Something isn't working)

Comments

@2010YOUY01
Contributor

Describe the bug

A query runs successfully on a table with a regular string column.
If the column's physical representation is converted to StringView, the same query fails with an internal error.

See reproducer in datafusion-cli
(Compiled from latest main using cargo run, commit a58416c)

The last query should succeed and return the same result as the one before it.

DataFusion CLI v41.0.0
> create table t1(v1 text);
0 row(s) fetched.
Elapsed 0.058 seconds.

> insert into t1 values ('DataFusion'), ('datafusion');
+-------+
| count |
+-------+
| 2     |
+-------+
1 row(s) fetched.
Elapsed 0.047 seconds.

> create table t1_stringview as
select arrow_cast(v1, 'Utf8View') as v1
from t1;
0 row(s) fetched.
Elapsed 0.011 seconds.

# Now we have two equivalent tables `t1` and `t1_stringview`
# The difference is physical representation for string column (StringArray and StringViewArray)

> select regexp_replace(v1,lower(v1),'bar') from t1;
+------------------------------------------------+
| regexp_replace(t1.v1,lower(t1.v1),Utf8("bar")) |
+------------------------------------------------+
| DataFusion                                     |
| bar                                            |
+------------------------------------------------+
2 row(s) fetched.
Elapsed 0.014 seconds.

> select regexp_replace(v1,lower(v1),'bar') from t1_stringview;
Internal error: could not cast value to arrow_array::array::byte_array::GenericByteArray<arrow_array::types::GenericStringType<i32>>.
This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
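
The error text suggests that the failing code path downcasts its input to one concrete array type (the `i32`-offset string array) and has no branch for the view layout. The thread does not show that code, so the following is only a minimal sketch of the mechanism under that assumption. `StringArray`, `StringViewArray`, `Array`, `first_value_strict` and `first_value_dispatch` are hypothetical stand-ins written with the Rust standard library only; they are not the arrow-rs or DataFusion APIs, and this is not the fix that was merged.

```rust
use std::any::Any;

// Hypothetical stand-ins for two physical string layouts.
// These are NOT the real arrow-rs types; they only model
// "two different concrete types behind one trait object".
trait Array {
    fn as_any(&self) -> &dyn Any;
}

struct StringArray(Vec<String>);
struct StringViewArray(Vec<String>);

impl Array for StringArray {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Array for StringViewArray {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Assumed shape of the bug: the kernel accepts exactly one layout,
// so any other layout surfaces as an internal "could not cast" error.
fn first_value_strict(array: &dyn Array) -> Result<String, String> {
    let strings = array
        .as_any()
        .downcast_ref::<StringArray>()
        .ok_or_else(|| "Internal error: could not cast value to StringArray".to_string())?;
    Ok(strings.0.first().cloned().unwrap_or_default())
}

// One possible remedy: try each supported layout before giving up.
fn first_value_dispatch(array: &dyn Array) -> Result<String, String> {
    let any = array.as_any();
    if let Some(a) = any.downcast_ref::<StringArray>() {
        Ok(a.0.first().cloned().unwrap_or_default())
    } else if let Some(a) = any.downcast_ref::<StringViewArray>() {
        Ok(a.0.first().cloned().unwrap_or_default())
    } else {
        Err("unsupported array type".to_string())
    }
}
```

In this sketch, passing a `StringViewArray` to `first_value_strict` returns an `Err`, while `first_value_dispatch` handles both layouts. That mirrors the difference between the `t1` and `t1_stringview` queries above: the logical data is identical and only the concrete type behind the array reference differs.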

To Reproduce

No response

Expected behavior

No response

Additional context

Found by SQLancer #11030

@2010YOUY01 2010YOUY01 added the bug Something isn't working label Aug 24, 2024
@2010YOUY01
Contributor Author

Maybe this can be fixed together while working on #11912

@devanbenz
Contributor

take

@devanbenz
Contributor

Disregard my closed PRs 😅 I accidentally committed a kind binary that was in my DF folder 😆
