Specialize Prefix/Suffix Match for Like/ILike between Array and Scalar for StringViewArray #6231

Merged
merged 11 commits into from
Aug 25, 2024

@xinlifoobar xinlifoobar commented Aug 13, 2024

Which issue does this PR close?

Parts of #5951.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added parquet Changes to the parquet crate arrow Changes to the arrow crate labels Aug 13, 2024
@xinlifoobar

xinlifoobar commented Aug 13, 2024

Bench from my dev machine

# xinli @ arch-dev in ~/source/repos/arrow-rs on git:dev/xinli1/optimize_prefix o [11:42:12] C:130
$ uname -a
Linux arch-dev 6.10.3-zen1-2-zen #1 ZEN SMP PREEMPT_DYNAMIC Tue, 06 Aug 2024 07:47:21 +0000 x86_64 GNU/Linux

# xinli @ arch-dev in ~/source/repos/arrow-rs on git:dev/xinli1/optimize_prefix o [12:33:29] C:130
$ critcmp master optimize_prefix optimize_prefix_boxed_iter                         
group                               master                                 optimize_prefix                        optimize_prefix_boxed_iter
-----                               ------                                 ---------------                        --------------------------
like_utf8view scalar complex        1.01    172.7±1.45ms        ? ?/sec    1.00    170.3±1.47ms        ? ?/sec    1.00    171.0±1.55ms        ? ?/sec
like_utf8view scalar contains       1.04    129.3±6.07ms        ? ?/sec    1.01    126.2±1.46ms        ? ?/sec    1.00    124.9±1.44ms        ? ?/sec
like_utf8view scalar ends with      1.02     37.6±0.45ms        ? ?/sec    1.00     36.7±0.38ms        ? ?/sec    1.02     37.6±0.39ms        ? ?/sec
like_utf8view scalar equals         1.01     26.2±0.42ms        ? ?/sec    1.02     26.5±0.38ms        ? ?/sec    1.00     26.0±0.27ms        ? ?/sec
like_utf8view scalar starts with    1.90     32.8±1.58ms        ? ?/sec    1.00     17.2±0.45ms        ? ?/sec    1.89     32.5±0.45ms        ? ?/sec

@xinlifoobar

Notably, this PR only covers starts-with/istarts-with LIKE between an array and a scalar. Array vs. array is trickier; I am looking into some feasible options...

@xinlifoobar

I read through the conditions here. Ideally, when the lhs is a scalar and the rhs is an array, shouldn't using op_scalar with the operands reversed be faster than building a scalar iterator?

https://github.com/apache/arrow-rs/blob/a693f0f9c37567b2b121e261fc0a4587776d5ca4/arrow-string/src/like.rs#L204C1-L221C14

CC @alamb @XiangpengHao

@xinlifoobar

xinlifoobar commented Aug 13, 2024

I reran the benchmark above while fixing the MSRV issue. Both the boxed iterator and the Vec variant hurt performance badly.

@xinlifoobar xinlifoobar reopened this Aug 13, 2024
@alamb

alamb commented Aug 13, 2024

2x faster on starts_with. not bad!

@alamb alamb left a comment

Thanks @xinlifoobar -- this is pretty cool

I think it would be cooler if we can figure out how to use StringView to help prefixes that are up to 12 bytes long as well

@github-actions github-actions bot removed the parquet Changes to the parquet crate label Aug 15, 2024
@xinlifoobar

xinlifoobar commented Aug 15, 2024

Here is an updated benchmark for the latest code. It indicates that the optimization only helps when the comparison fits in the first 4/12 bytes. Whenever it has to reach into the buffer, most of the gain disappears. Given this result, I suspect it won't help complex cases like regex. I will test them though.

# xinli @ arch-dev in ~/source/repos/arrow-rs on git:dev/xinli1/optimize_prefix o [22:23:48] 
$ critcmp master optimize_prefix                                                    
group                                                 master                                 optimize_prefix
-----                                                 ------                                 ---------------
like_utf8view scalar complex                          1.00    174.5±2.43ms        ? ?/sec    1.01    176.0±2.07ms        ? ?/sec
like_utf8view scalar contains                         1.00    129.2±1.50ms        ? ?/sec    1.02    131.6±5.38ms        ? ?/sec
like_utf8view scalar ends with                        1.01     38.0±0.68ms        ? ?/sec    1.00     37.7±0.38ms        ? ?/sec
like_utf8view scalar equals                           1.00     26.2±0.34ms        ? ?/sec    1.00     26.2±0.37ms        ? ?/sec
like_utf8view scalar starts with                      1.63     32.6±0.40ms        ? ?/sec    1.00     20.0±0.29ms        ? ?/sec
like_utf8view scalar starts with more than 4 bytes    1.07     33.8±0.38ms        ? ?/sec    1.00     31.7±0.28ms        ? ?/sec
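
For readers following the 4/12-byte observation, here is a minimal sketch of why a short prefix can be checked without touching the data buffer. It assumes the Arrow view layout (the low 32 bits hold the length, strings of up to 12 bytes are stored inline, longer strings keep only a 4-byte prefix inline). `make_view` and `starts_with_inline` are hypothetical names used for illustration; they are not the functions added in this PR.

```rust
/// Hypothetical helper: pack a byte string into a simplified 16-byte view.
/// Strings longer than 12 bytes keep only their first 4 bytes inline; the
/// buffer index and offset fields are left as zero (not modeled here).
fn make_view(s: &[u8]) -> u128 {
    let mut bytes = [0u8; 16];
    bytes[..4].copy_from_slice(&(s.len() as u32).to_le_bytes());
    let inline = if s.len() <= 12 { s.len() } else { 4 };
    bytes[4..4 + inline].copy_from_slice(&s[..inline]);
    u128::from_le_bytes(bytes)
}

/// Returns `Some(answer)` when the prefix test can be decided from the view
/// alone, and `None` when the data buffer would have to be consulted.
fn starts_with_inline(view: u128, prefix: &[u8]) -> Option<bool> {
    // Same length extraction as in the diff: the low 32 bits of the view.
    let len = (view as u32) as usize;
    if len < prefix.len() {
        return Some(false);
    }
    let bytes = view.to_le_bytes();
    // Number of value bytes that are available inside the view itself.
    let inline_len = if len <= 12 { len } else { 4 };
    if prefix.len() <= inline_len {
        Some(&bytes[4..4 + prefix.len()] == prefix)
    } else {
        None
    }
}
```

Under these assumptions a 4-byte prefix never needs the buffer, while a 6-byte prefix against a long string does, which would be consistent with the gap between the two "starts with" rows above.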

@alamb

alamb commented Aug 15, 2024

I am hoping to find time to review this in more detail tomorrow

@xinlifoobar

I got some better results on other string view predicates. Will update them in batch tonight.

@xinlifoobar xinlifoobar force-pushed the dev/xinli1/optimize_prefix branch from 4f0f49a to 894e797 Compare August 17, 2024 14:28
@xinlifoobar

It seems the previous results were generated by faulty code. Let me do more iterations on this.

@xinlifoobar

xinlifoobar commented Aug 19, 2024

Updated the Benchmark results for the latest version.

$ uname -a
Linux arch-dev 6.10.4-zen2-1-zen #1 ZEN SMP PREEMPT_DYNAMIC Sun, 11 Aug 2024 16:18:46 +0000 x86_64 GNU/Linux

# xinli @ arch-dev in ~/source/repos/arrow-rs on git:dev/xinli/prefix_v2 x [17:34:42] 
$ critcmp master_08_19 optimized_prefix_suffix                                              
group                                                 master_08_19                           optimized_prefix_suffix
-----                                                 ------------                           -----------------------
like_utf8view scalar complex                          1.06   184.5±15.76ms        ? ?/sec    1.00    174.7±2.58ms        ? ?/sec
like_utf8view scalar contains                         1.03    130.7±3.52ms        ? ?/sec    1.00    127.5±2.69ms        ? ?/sec
like_utf8view scalar ends with                        1.10     37.8±0.54ms        ? ?/sec    1.00     34.3±0.56ms        ? ?/sec
like_utf8view scalar equals                           1.00     26.4±0.59ms        ? ?/sec    1.03     27.2±0.94ms        ? ?/sec
like_utf8view scalar starts with                      1.59     32.8±0.55ms        ? ?/sec    1.00     20.6±0.65ms        ? ?/sec
like_utf8view scalar starts with more than 4 bytes    1.07     34.0±0.53ms        ? ?/sec    1.00     31.9±0.44ms        ? ?/sec

@xinlifoobar

xinlifoobar commented Aug 19, 2024

Observations Based on Changes:

  1. The inline implementation of starts_with resulted in notable performance improvements.
  2. The suffix_iter helps performance slightly and should also help memory usage, since we don't have to materialize the whole str and (logically) reverse it.
  3. Switching from &str to &[u8] showed only marginal performance gains (a few milliseconds), with a trade-off in flexibility and readability. Since str is essentially a byte container, the benefits do not justify the loss in code clarity.
  4. No significant performance difference exists between using BooleanArray::from::<Vec<bool>> and BooleanArray::from_unary.

@xinlifoobar xinlifoobar marked this pull request as ready for review August 19, 2024 09:27
@alamb alamb changed the title Implement Prefix Match for Like/ILike between Array and Scalar Specialize Prefix Match for Like/ILike between Array and Scalar for StringViewArray Aug 19, 2024

@alamb alamb left a comment

Thanks @xinlifoobar -- other than the suffix_iter producing potentially invalid &str I think this PR looks good to me
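
To illustrate the concern about invalid `&str`, here is a small sketch using only the standard library. Taking the last `n` bytes of a string can land in the middle of a multi-byte character, so the suffix is only safe to hand out as `&[u8]`. `suffix_bytes` and `ends_with_bytes` are hypothetical names, not the PR's API.

```rust
/// Take the last `n` bytes of a string as raw bytes. This is always valid,
/// because a byte slice carries no UTF-8 invariant.
fn suffix_bytes(s: &str, n: usize) -> &[u8] {
    let b = s.as_bytes();
    &b[b.len().saturating_sub(n)..]
}

/// Byte-wise suffix test; no `&str` is ever built from a partial character.
fn ends_with_bytes(haystack: &str, needle: &str) -> bool {
    suffix_bytes(haystack, needle.len()) == needle.as_bytes()
}
```

With `"ab😈"` (the emoji is four bytes long), the last two bytes are continuation bytes, so `std::str::from_utf8` rejects them, while the byte-wise comparison still gives the right answer for a suffix match.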

@xinlifoobar

Some minor improvements after using &[u8] directly.

like_utf8view scalar complex
----------------------------
optimized_prefix_suffix_bytes     1.00     171.9±2.78ms       ? ?/sec
optimized_prefix_suffix           1.02     174.7±2.58ms       ? ?/sec
master_08_19                      1.07    184.5±15.76ms       ? ?/sec

like_utf8view scalar contains
-----------------------------
optimized_prefix_suffix_bytes     1.00     124.7±1.66ms       ? ?/sec
optimized_prefix_suffix           1.02     127.5±2.69ms       ? ?/sec
master_08_19                      1.05     130.7±3.52ms       ? ?/sec

like_utf8view scalar ends with
------------------------------
optimized_prefix_suffix_bytes     1.00      33.6±0.52ms       ? ?/sec
optimized_prefix_suffix           1.02      34.3±0.56ms       ? ?/sec
master_08_19                      1.12      37.8±0.54ms       ? ?/sec

like_utf8view scalar equals
---------------------------
master_08_19                      1.00      26.4±0.59ms       ? ?/sec
optimized_prefix_suffix_bytes     1.01      26.7±0.56ms       ? ?/sec
optimized_prefix_suffix           1.03      27.2±0.94ms       ? ?/sec

like_utf8view scalar starts with
--------------------------------
optimized_prefix_suffix_bytes     1.00      20.4±0.27ms       ? ?/sec
optimized_prefix_suffix           1.01      20.6±0.65ms       ? ?/sec
master_08_19                      1.61      32.8±0.55ms       ? ?/sec

like_utf8view scalar starts with more than 4 bytes
--------------------------------------------------
optimized_prefix_suffix_bytes     1.00      31.8±0.35ms       ? ?/sec
optimized_prefix_suffix           1.00      31.9±0.44ms       ? ?/sec
master_08_19                      1.07      34.0±0.53ms       ? ?/sec

@xinlifoobar xinlifoobar changed the title Specialize Prefix Match for Like/ILike between Array and Scalar for StringViewArray Specialize Prefix/Suffix Match for Like/ILike between Array and Scalar for StringViewArray Aug 20, 2024

@alamb alamb left a comment

Thanks @xinlifoobar -- I am running the benchmarks on this branch and will report back

// 😈 is four bytes long.
test_utf8_scalar!(
test_uff8_array_like_multibyte,
vec![
Contributor

🤔 it occurs to me we should also be testing with Options as well (aka the test data should have nulls)
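
As a rough illustration of what test data with nulls would exercise, here is a sketch in plain Rust rather than the crate's `test_utf8_scalar!` macro. `starts_with_nullable` is a hypothetical stand-in for a scalar prefix kernel with SQL-style null propagation.

```rust
/// Hypothetical scalar kernel: `value LIKE 'prefix%'` where a null input
/// yields a null output instead of `false`.
fn starts_with_nullable(values: &[Option<&str>], prefix: &str) -> Vec<Option<bool>> {
    values
        .iter()
        .map(|v| v.map(|s| s.starts_with(prefix)))
        .collect()
}
```

A test along these lines would mix `Some` and `None` inputs and assert that the null slot stays null in the result.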

let len = (*v as u32) as usize;

if len < prefix_len {
return &[] as &[u8];
Contributor

As you mentioned above, returning an empty slice only for the caller to immediately check it again might be another place for a performance improvement

What do you think about making this more general and take a function? Maybe something like the following (untested)

    /// Applies function `func` to the first `prefix_len` bytes of every view.
    /// If the view length is less than `prefix_len`, `func` is invoked with `None`.
    pub fn prefix_bytes_iter<F, T>(&self, prefix_len: usize, func: F) -> impl Iterator<Item = T>
    where
        F: FnMut(Option<&[u8]>) -> T,
    {
        ...
    }

I am not sure this is a good idea but figured maybe it would be more general. But maybe not...
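
For discussion purposes, the closure-based shape suggested above could look roughly like this on a toy type. `ToyViews` is a hypothetical stand-in that simply owns a list of byte strings; it does not model `GenericByteViewArray` or its view layout.

```rust
/// Hypothetical stand-in for a string-view array.
struct ToyViews {
    values: Vec<Vec<u8>>,
}

impl ToyViews {
    /// Calls `func` with the first `prefix_len` bytes of each value, or with
    /// `None` when the value is shorter than `prefix_len`.
    fn prefix_bytes_iter<'s, F, T>(
        &'s self,
        prefix_len: usize,
        mut func: F,
    ) -> impl Iterator<Item = T> + 's
    where
        F: FnMut(Option<&[u8]>) -> T + 's,
        T: 's,
    {
        // `get(..prefix_len)` returns `None` for values that are too short,
        // so the caller no longer has to re-check an empty slice.
        self.values.iter().map(move |v| func(v.get(..prefix_len)))
    }
}
```

A caller would pass the comparison as the closure, for example `|p| p == Some(&b"app"[..])`, and collect the resulting booleans.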

@xinlifoobar xinlifoobar Aug 21, 2024

I thought passing a function pointer to the *_iter methods was a bad decision. I actually did this in the first version of this PR, e.g.:

pub fn predicate(&self, func: F) -> Impl ArrayRef
where F: FnMut(Option<&[u8]>) -> T
{
}

# or 

pub fn predicate_prefix(&self, func: F) -> Impl ArrayRef
where F: FnMut(Option<&[u8]>) -> T
{
}

This worked, but it introduced a circular dependency between the crates, i.e.,

# past
Predicate --evaluate_array--> Array

# after
Predicate --evaluate_array--> Array --predicate--> Predicate Function --evaluate--> Array Item.

This could be solved by restructuring the code, but that would require a lot of changes.

Also, the functions would become very specialized, which they should not be. The function signature is not flexible enough to cover all such requirements.

Contributor

Makes sense -- thank you for the explanation. Let's keep exploring this method for now

@alamb

alamb commented Aug 20, 2024

🤔 the benchmarks now fail for me like this:


Benchmarking eq scalar StringViewArray: Warming up for 3.0000 s
thread 'main' panicked at arrow/benches/comparison_kernels.rs:196:50:
called `Result::unwrap()` on an `Err` value: InvalidArgumentError("Invalid comparison operation: Utf8 == Utf8View")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

error: bench failed, to rerun pass `-p arrow --bench comparison_kernels`

@xinlifoobar

xinlifoobar commented Aug 21, 2024

This issue can be reproduced on the master branch... Looking at the history, the benchmark could not have worked at the time it was checked in. If I comment out this bench, everything else works...

https://github.com/alamb/arrow-rs/blob/8941cbf5325b380bf70ea1ee5950f570a102c873/arrow-ord/src/cmp.rs#L235-L239

This would be a more complex fix than expected. The functions that follow, including apply and apply_op*, expect the lhs and rhs to have the same data type. How about doing the conversions beforehand?

@alamb

alamb commented Aug 21, 2024

This issue can be reproduced on the master branch... Looking at the history, the benchmark could not have worked at the time it was checked in. If I comment out this bench, everything else works...

Thanks @xinlifoobar -- indeed this does appear to be an issue on the master branch. I filed #6283 and will fix it shortly

@alamb alamb left a comment

Thanks again, @xinlifoobar. I think this PR is looking quite nice now.

I am sorry for all the back and forth with this PR, but I think we are on a good track now. Doing performance optimizations always seems to take much longer than I think/hope it should :)

I think the high level structure / idea of this PR is looking good

What I think is needed next is to run the benchmarks and see how much better this branch is than master (and validate if bytes_iter() makes a difference, for example)

I think once we have merged #6284 and merged up to this branch we should be able to test it out.

.collect::<Vec<_>>(),
)
} else {
BooleanArray::from_unary(array, |haystack| {
Contributor

looking more carefully at BooleanArray::from_unary it will use the ArrayAccessor impl for StringViewArray

impl<'a, T: ByteViewType + ?Sized> ArrayAccessor for &'a GenericByteViewArray<T> {
    type Item = &'a T::Native;
    fn value(&self, index: usize) -> Self::Item {
        GenericByteViewArray::value(self, index)
    }
    unsafe fn value_unchecked(&self, index: usize) -> Self::Item {
        GenericByteViewArray::value_unchecked(self, index)
    }
}

It isn't clear to me how calling bytes_iter() would make this faster, as the code for value_unchecked is the same as bytes_iter

Contributor

I think the real test is to benchmark it and see if this improves things

@xinlifoobar xinlifoobar Aug 22, 2024

Yeah, I think the only differences between bytes_iter and ArrayAccessor are:

For bytes_iter

self.views().has_next()? -> self.views().next() -> value_unchecked()

For ArrayAccessor

index = index + 1 -> self.views.get_unchecked(idx) -> str(value_unchecked()).as_bytes()

The only differences are between the indexing operations and the iterator methods. The benchmark results also stay within a 5% range.

Contributor

I made a PR here to refine this documentation: #6306

Contributor

Another difference is that bytes_iter() iterates over all array slots, including those that are null
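
A small sketch of what iterating over every slot implies, using plain slices instead of the real array types. `eval_all_slots` is a hypothetical function; the validity mask is modeled as a `&[bool]` rather than Arrow's null buffer.

```rust
/// Evaluates a prefix predicate on every slot, then masks out null slots.
/// The predicate runs even where `validity` is false; whatever bytes sit in
/// a null slot are compared and the result is discarded afterwards.
fn eval_all_slots(values: &[&[u8]], validity: &[bool], prefix: &[u8]) -> Vec<Option<bool>> {
    values
        .iter()
        .zip(validity)
        .map(|(v, &valid)| {
            let hit = v.starts_with(prefix); // computed for null slots too
            if valid { Some(hit) } else { None }
        })
        .collect()
}
```

In this model the extra work on null slots is harmless as long as the final result is combined with the validity information.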

@xinlifoobar

xinlifoobar commented Aug 22, 2024

New benchmark results. It looks like the suffix_iter proves itself here.

# xinli @ arch-dev in ~/source/repos/arrow-rs on git:dev/xinli/prefix_v2 o [13:29:59] 
$ uname -a 
Linux arch-dev 6.10.4-zen2-1-zen #1 ZEN SMP PREEMPT_DYNAMIC Sun, 11 Aug 2024 16:18:46 +0000 x86_64 GNU/Linux

# xinli @ arch-dev in ~/source/repos/arrow-rs on git:dev/xinli/prefix_v2 o [13:30:07] 
$ critcmp master_08_22 optimized_like_08_22     
group                                        master_08_22                           optimized_like_08_22
-----                                        ------------                           --------------------
like_utf8view scalar complex                 1.01    175.9±6.78ms        ? ?/sec    1.00    173.7±3.55ms        ? ?/sec
like_utf8view scalar contains                1.02    131.7±1.71ms        ? ?/sec    1.00    128.6±2.73ms        ? ?/sec
like_utf8view scalar ends with 13 bytes      1.07     33.1±0.54ms        ? ?/sec    1.00     30.9±0.54ms        ? ?/sec
like_utf8view scalar ends with 4 bytes       1.16     37.1±0.65ms        ? ?/sec    1.00     32.0±0.74ms        ? ?/sec
like_utf8view scalar ends with 6 bytes       1.21     38.4±0.57ms        ? ?/sec    1.00     31.8±1.08ms        ? ?/sec
like_utf8view scalar equals                  1.00     27.2±0.88ms        ? ?/sec    1.01     27.6±1.25ms        ? ?/sec
like_utf8view scalar starts with 13 bytes    1.00     30.7±0.53ms        ? ?/sec    1.00     30.7±0.67ms        ? ?/sec
like_utf8view scalar starts with 4 bytes     1.56     34.5±1.64ms        ? ?/sec    1.00     22.1±0.40ms        ? ?/sec
like_utf8view scalar starts with 6 bytes     1.11     34.9±0.62ms        ? ?/sec    1.00     31.3±0.53ms        ? ?/sec

@alamb alamb left a comment

TLDR is I think this is good to go. Thank you @xinlifoobar this is very nice

It would be nice to avoid the need for bytes_iter if possible before release. I will explore doing so in a follow on PR

I ran the benchmarks again 👨‍🍳 👌

++ critcmp master dev_xinli1_optimize_prefix
group                                                     dev_xinli1_optimize_prefix             master
-----                                                     --------------------------             ------
ilike_utf8 scalar complex                                 1.00      2.7±0.08ms        ? ?/sec    1.00      2.7±0.09ms        ? ?/sec
ilike_utf8 scalar contains                                1.00      4.2±0.07ms        ? ?/sec    1.00      4.2±0.08ms        ? ?/sec
ilike_utf8 scalar ends with                               1.00  1235.1±38.98µs        ? ?/sec    1.00  1240.9±41.24µs        ? ?/sec
ilike_utf8 scalar equals                                  1.00   773.0±23.01µs        ? ?/sec    1.01   780.1±20.32µs        ? ?/sec
ilike_utf8 scalar starts with                             1.00  1132.5±25.82µs        ? ?/sec    1.00  1135.2±39.42µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])           1.00     88.3±0.16µs        ? ?/sec    1.00     88.2±0.09µs        ? ?/sec
like_utf8 scalar complex                                  1.01  1876.9±53.81µs        ? ?/sec    1.00  1859.7±27.02µs        ? ?/sec
like_utf8 scalar contains                                 1.00  1726.5±15.40µs        ? ?/sec    1.03  1774.5±19.87µs        ? ?/sec
like_utf8 scalar ends with                                1.03    440.5±6.68µs        ? ?/sec    1.00   426.8±13.04µs        ? ?/sec
like_utf8 scalar equals                                   1.00     90.9±0.22µs        ? ?/sec    1.39    126.8±0.13µs        ? ?/sec
like_utf8 scalar starts with                              1.03    341.8±5.19µs        ? ?/sec    1.00    333.4±3.99µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])            1.00     88.2±0.18µs        ? ?/sec    1.00     88.0±0.11µs        ? ?/sec
like_utf8view scalar complex                              1.03    183.7±1.27ms        ? ?/sec    1.00    179.0±0.62ms        ? ?/sec
like_utf8view scalar contains                             1.00    129.9±0.28ms        ? ?/sec    1.04    135.4±0.22ms        ? ?/sec
like_utf8view scalar ends with 13 bytes                   1.00     43.7±0.27ms        ? ?/sec    1.15     50.5±0.22ms        ? ?/sec
like_utf8view scalar ends with 4 bytes                    1.00     44.7±0.16ms        ? ?/sec    1.22     54.4±0.21ms        ? ?/sec
like_utf8view scalar ends with 6 bytes                    1.00     44.7±0.23ms        ? ?/sec    1.24     55.5±0.11ms        ? ?/sec
like_utf8view scalar equals                               1.00     32.3±0.09ms        ? ?/sec    1.07     34.5±0.07ms        ? ?/sec
like_utf8view scalar starts with 13 bytes                 1.00     45.9±0.15ms        ? ?/sec    1.00     46.1±0.30ms        ? ?/sec
like_utf8view scalar starts with 4 bytes                  1.00     25.2±0.11ms        ? ?/sec    1.93     48.6±0.10ms        ? ?/sec
like_utf8view scalar starts with 6 bytes                  1.00     46.3±0.21ms        ? ?/sec    1.07     49.5±0.19ms        ? ?/sec

@alamb

alamb commented Aug 25, 2024

🚀 -- thanks again for sticking with this @xinlifoobar
