Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize take kernel for BinaryViewArray and StringViewArray #6168

Merged
merged 3 commits into from
Aug 2, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 3 additions & 5 deletions arrow-select/src/take.rs
Original file line number Diff line number Diff line change
Expand Up @@ -487,11 +487,9 @@ fn take_byte_view<T: ByteViewType, IndexType: ArrowPrimitiveType>(
) -> Result<GenericByteViewArray<T>, ArrowError> {
let new_views = take_native(array.views(), indices);
let new_nulls = take_nulls(array.nulls(), indices);
Ok(GenericByteViewArray::new(
new_views,
array.data_buffers().to_vec(),
new_nulls,
))
Ok(unsafe {
a10y marked this conversation as resolved.
Show resolved Hide resolved
GenericByteViewArray::new_unchecked(new_views, array.data_buffers().to_vec(), new_nulls)
})
}

/// `take` implementation for list arrays
Expand Down
36 changes: 36 additions & 0 deletions arrow/benches/take_kernels.rs
Original file line number Diff line number Diff line change
Expand Up @@ -149,6 +149,42 @@ fn add_benchmark(c: &mut Criterion) {
b.iter(|| bench_take(&values, &indices))
});

let values = create_string_view_array(512, 0.0);
let indices = create_random_index(512, 0.0);
c.bench_function("take stringview 512", |b| {
b.iter(|| bench_take(&values, &indices))
});

let values = create_string_view_array(1024, 0.0);
let indices = create_random_index(1024, 0.0);
c.bench_function("take stringview 1024", |b| {
b.iter(|| bench_take(&values, &indices))
});

let values = create_string_view_array(512, 0.0);
let indices = create_random_index(512, 0.5);
c.bench_function("take stringview null indices 512", |b| {
b.iter(|| bench_take(&values, &indices))
});

let values = create_string_view_array(1024, 0.0);
let indices = create_random_index(1024, 0.5);
c.bench_function("take stringview null indices 1024", |b| {
b.iter(|| bench_take(&values, &indices))
});

let values = create_string_view_array(1024, 0.5);
let indices = create_random_index(1024, 0.0);
c.bench_function("take stringview null values 1024", |b| {
b.iter(|| bench_take(&values, &indices))
});

let values = create_string_view_array(1024, 0.5);
let indices = create_random_index(1024, 0.5);
c.bench_function("take stringview null values null indices 1024", |b| {
b.iter(|| bench_take(&values, &indices))
});

let values = create_primitive_run_array::<Int32Type, Int32Type>(1024, 512);
let indices = create_random_index(1024, 0.0);
c.bench_function(
Expand Down
28 changes: 28 additions & 0 deletions arrow/src/util/bench_util.rs
Original file line number Diff line number Diff line change
Expand Up @@ -160,6 +160,34 @@ pub fn create_string_array_with_len<Offset: OffsetSizeTrait>(
.collect()
}

/// Creates a random (but fixed-seeded) string view array of a given size and null density.
///
/// See `create_string_array` above for more details.
pub fn create_string_view_array(size: usize, null_density: f32) -> StringViewArray {
create_string_view_array_with_max_len(size, null_density, 400)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a max len of 400 means most of the string values will be "long" (not inlined views which happens for values less than 12 bytes in length). I don't think that is critical I am just pointing it out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep it's a good point. I do think that the change is going to be more impactful for outlined vs inlined strings, as validation time is linear w.r.t. length of string.

Though, even for the relatively short strings in TPC-H, the validation step was really significant (~22% of execution time)

I have a flamegraph for q10 in spiraldb/vortex#476 (comment).

}

/// Creates a random (but fixed-seeded) array of rand size with a given max size, null density and length
fn create_string_view_array_with_max_len(
size: usize,
null_density: f32,
max_str_len: usize,
) -> StringViewArray {
let rng = &mut seedable_rng();
(0..size)
.map(|_| {
if rng.gen::<f32>() < null_density {
None
} else {
let str_len = rng.gen_range(0..max_str_len);
let value = rng.sample_iter(&Alphanumeric).take(str_len).collect();
let value = String::from_utf8(value).unwrap();
Some(value)
}
})
.collect()
}

/// Creates a random (but fixed-seeded) array of a given size, null density and length
pub fn create_string_view_array_with_len(
size: usize,
Expand Down
Loading