Is your feature request related to a problem or challenge? Please describe what you are trying to do.
While working to enable wider use of StringView in DataFusion (apache/datafusion#11723), I found that this cast is not supported:
Specifically, create a BinaryViewArray and then call cast to cast it to Utf8:
cast(binary_view_array,&DataType::Utf8)
External error: query failed: DataFusion error: Error during planning: Cannot cast file schema field string_col of type BinaryView to table schema field of type Utf8
I think this happens when a column is marked as "binary" in a Parquet file and DataFusion tries to read it as a Utf8 column: the reader rejects the cast.
Describe the solution you'd like
Add support to the cast kernel for BinaryView -> Utf8
@RinChanNOWWW added most of the support in #5704, and I think we can simply use the cast_view_to_byte function to build the correct StringArray
// Workaround arrow-rs bug in can_cast_types
// External error: query failed: DataFusion error: Arrow error: Cast error: Casting from BinaryView to Utf8 not supported
fn can_cast_types(from_type: &DataType, to_type: &DataType) -> bool {
    arrow::compute::can_cast_types(from_type, to_type)
        || matches!(
            (from_type, to_type),
            (DataType::BinaryView, DataType::Utf8 | DataType::LargeUtf8)
                | (DataType::Utf8 | DataType::LargeUtf8, DataType::BinaryView)
        )
}

// Work around arrow-rs casting bug
// External error: query failed: DataFusion error: Arrow error: Cast error: Casting from BinaryView to Utf8 not supported
fn cast(array: &dyn Array, to_type: &DataType) -> Result<ArrayRef, ArrowError> {
    match (array.data_type(), to_type) {
        (DataType::BinaryView, DataType::Utf8) => {
            let array = array.as_binary_view();
            let mut builder = StringBuilder::with_capacity(array.len(), 8 * 1024);
            for value in array.iter() {
                // check if the value is valid utf8 (should do this once, not each value)
                let value = value.map(|value| std::str::from_utf8(value)).transpose()?;
                builder.append_option(value);
            }
            Ok(Arc::new(builder.finish()))
        }
        // fallback to arrow kernel
        (_, _) => arrow::compute::cast(array, to_type),
    }
}
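The comment in the workaround notes that UTF-8 validation should happen once rather than per value. One way that could look is sketched below using only the standard library. `Utf8Column` and `binary_to_utf8` are hypothetical names for illustration, not arrow-rs types or functions, and the layout (offsets, contiguous values, validity flags) is a simplified stand-in for a real StringArray. The idea: copy all values into one buffer, validate that buffer in a single pass, then confirm that every value boundary falls on a character boundary.

```rust
/// Simplified stand-in for a Utf8 array: contiguous bytes plus offsets and validity.
/// (Hypothetical type for illustration; not part of arrow-rs.)
#[derive(Debug, PartialEq)]
struct Utf8Column {
    offsets: Vec<usize>,
    values: Vec<u8>,
    validity: Vec<bool>,
}

impl Utf8Column {
    /// Returns the string in slot `i`, or `None` if the slot is null.
    fn get(&self, i: usize) -> Option<&str> {
        if !self.validity[i] {
            return None;
        }
        let bytes = &self.values[self.offsets[i]..self.offsets[i + 1]];
        // Re-checked here only to keep the sketch free of `unsafe`.
        std::str::from_utf8(bytes).ok()
    }
}

/// Copies binary values into one buffer, then validates UTF-8 once.
fn binary_to_utf8(input: &[Option<&[u8]>]) -> Result<Utf8Column, String> {
    let mut offsets = Vec::with_capacity(input.len() + 1);
    let mut values = Vec::new();
    let mut validity = Vec::with_capacity(input.len());
    offsets.push(0);
    for item in input {
        if let Some(bytes) = *item {
            values.extend_from_slice(bytes);
        }
        validity.push(item.is_some());
        offsets.push(values.len());
    }
    // One validation pass over the whole buffer instead of one per value.
    let text = std::str::from_utf8(&values).map_err(|e| e.to_string())?;
    // Each value must also start on a character boundary; otherwise a
    // multi-byte character was split across two adjacent values.
    for &offset in &offsets {
        if !text.is_char_boundary(offset) {
            return Err(format!(
                "value boundary at byte {} splits a UTF-8 character",
                offset
            ));
        }
    }
    Ok(Utf8Column { offsets, values, validity })
}
```

The boundary check matters because two values that are each invalid on their own (for example, the two bytes of "é" stored as separate values) can concatenate into a valid buffer; checking the offsets rejects that case.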
Part of #6163
Describe alternatives you've considered
Additional context
FYI @XiangpengHao