-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(common): reduce buffer reallocations in HashKey::build
#8885
Conversation
HashKey::build
HashKey::build
Before and after (around 10-15% improvement): noelkwan@Noels-MacBook-Pro benches % cargo bench --bench bench_hash_key_encoding -- "vec ser varchar" && git checkout - && cargo bench --bench bench_hash_key_encoding -- "vec ser varchar"
Compiling risingwave_common v0.2.0-alpha (/Users/noelkwan/projects/risingwave/src/common)
Finished bench [optimized] target(s) in 25.40s
Running benches/bench_hash_key_encoding.rs (/Users/noelkwan/projects/risingwave/target/release/deps/bench_hash_key_encoding-b933324b2cd47305)
vec ser varchar KeySerialized, 128 rows, Pr[null]=0
time: [25.341 µs 25.384 µs 25.428 µs]
change: [+13.522% +13.868% +14.235%] (p = 0.00 < 0.05)
Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
vec ser varchar KeySerialized, 1024 rows, Pr[null]=0
time: [203.85 µs 204.37 µs 204.86 µs]
change: [+12.867% +13.254% +13.648%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild
vec ser varchar KeySerialized, 128 rows, Pr[null]=0.01
time: [25.202 µs 25.233 µs 25.266 µs]
change: [+12.170% +12.472% +12.761%] (p = 0.00 < 0.05)
Performance has regressed.
vec ser varchar KeySerialized, 1024 rows, Pr[null]=0.01
time: [185.17 µs 185.59 µs 186.01 µs]
change: [+4.6333% +7.0009% +9.1626%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) low mild
1 (1.00%) high mild
vec ser varchar KeySerialized, 128 rows, Pr[null]=0.1
time: [20.482 µs 20.509 µs 20.537 µs]
change: [+13.448% +13.777% +14.101%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
vec ser varchar KeySerialized, 1024 rows, Pr[null]=0.1
time: [165.04 µs 165.43 µs 165.87 µs]
change: [+14.055% +14.376% +14.712%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
Switched to branch 'kwannoel/reserve-buffer'
Your branch is up to date with 'origin/kwannoel/reserve-buffer'.
Compiling risingwave_common v0.2.0-alpha (/Users/noelkwan/projects/risingwave/src/common)
Finished bench [optimized] target(s) in 25.80s
Running benches/bench_hash_key_encoding.rs (/Users/noelkwan/projects/risingwave/target/release/deps/bench_hash_key_encoding-b933324b2cd47305)
vec ser varchar KeySerialized, 128 rows, Pr[null]=0
time: [22.171 µs 22.220 µs 22.273 µs]
change: [-12.841% -12.595% -12.325%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
6 (6.00%) high mild
vec ser varchar KeySerialized, 1024 rows, Pr[null]=0
time: [176.80 µs 177.03 µs 177.24 µs]
change: [-13.430% -13.232% -13.020%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
vec ser varchar KeySerialized, 128 rows, Pr[null]=0.01
time: [22.198 µs 22.228 µs 22.267 µs]
change: [-11.930% -11.787% -11.626%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
5 (5.00%) high mild
3 (3.00%) high severe
vec ser varchar KeySerialized, 1024 rows, Pr[null]=0.01
time: [158.09 µs 158.27 µs 158.44 µs]
change: [-14.925% -14.707% -14.501%] (p = 0.00 < 0.05)
Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
6 (6.00%) low mild
3 (3.00%) high mild
1 (1.00%) high severe
vec ser varchar KeySerialized, 128 rows, Pr[null]=0.1
time: [17.546 µs 17.571 µs 17.597 µs]
change: [-14.388% -14.227% -14.065%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
vec ser varchar KeySerialized, 1024 rows, Pr[null]=0.1
time: [141.64 µs 141.83 µs 142.00 µs]
change: [-14.311% -14.122% -13.934%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild |
Codecov Report
@@ Coverage Diff @@
## main #8885 +/- ##
=======================================
Coverage 70.78% 70.78%
=======================================
Files 1177 1177
Lines 195369 195382 +13
=======================================
+ Hits 138285 138296 +11
- Misses 57084 57086 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 5 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
let estimated_key_size = data_chunk.estimate_value_encoding_size(column_idxes); | ||
// Construct serializers for each row. | ||
let mut serializers: Vec<Self::S> = hash_codes | ||
.into_iter() | ||
.map(Self::S::from_hash_code) | ||
.map(|hashcode| Self::S::from_hash_code(hashcode, estimated_key_size)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is not good enough, we can get each row's value encoding size like #8924. And after that, we can avoid any grow of the vec
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Thanks to @TennyZhuang and @st1page for their investigations.
This is part of optimizations suggested by @st1page: #8802 (comment).
We estimate size of
HashKey
to avoid unnecessary reallocations fromvec
.#8683
Checklist For Contributors
./risedev check
(or alias,./risedev c
)Checklist For Reviewers
Documentation
Click here for Documentation
Types of user-facing changes
Please keep the types that apply to your changes, and remove the others.
Release note