
refactor(common): unify implementation of hash key #9671

Merged — 16 commits merged into main on May 11, 2023

Conversation

@BugenZhao BugenZhao (Member) commented May 8, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

See #9659 for the background of this PR. It almost completely rewrites the implementation of HashKey and unifies it into a single struct with generics.

The motivation for this large refactor:

  • For variable-length hash keys, we misused value encoding, which is incorrect since it does not respect Eq.
  • memcomparable encoding is inefficient since it additionally provides Ord, which we don't need for hashing.
  • We already maintain a HashKeySerde (but it's only used and implemented for Copy scalar types); let's extend it and make it the single source of truth for hash-key encoding.
  • The original trait definition on ScalarRef won't compile for non-Copy scalar types. Also, for efficiency, we should not return a Vec and copy it again for variable-length types; let's write directly into the BufMut. So this trait needs to be redesigned.
  • Now the encodings are exactly the same for FixedSizeKey and SerializedKey, and the only difference is the memory backend (stack array vs. heap vector), so let's extract it.
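The last two bullets can be sketched as follows — a simplified, hypothetical illustration of the idea (the real `HashKeyImpl`, storage traits, and buffer types in `key_v2.rs` differ in detail; all names below are illustrative, not the actual RisingWave API):

```rust
use std::hash::Hash;

/// A write-only buffer that yields the sealed key bytes.
trait Buffer {
    /// The sealed key must be hashable and respect `Eq`.
    type Sealed: Hash + Eq;
    fn put_slice(&mut self, bytes: &[u8]);
    fn seal(self) -> Self::Sealed;
}

/// The memory backend used to build the key.
trait KeyStorage {
    type Buffer: Buffer;
    fn new_buffer(exact_size: Option<usize>) -> Self::Buffer;
}

/// Heap backend: a plain `Vec<u8>` (for a `SerializedKey`-like type).
struct HeapStorage;
impl KeyStorage for HeapStorage {
    type Buffer = Vec<u8>;
    fn new_buffer(exact_size: Option<usize>) -> Vec<u8> {
        Vec::with_capacity(exact_size.unwrap_or(0))
    }
}
impl Buffer for Vec<u8> {
    type Sealed = Vec<u8>;
    fn put_slice(&mut self, bytes: &[u8]) {
        self.extend_from_slice(bytes);
    }
    fn seal(self) -> Vec<u8> {
        self
    }
}

/// Stack backend: a fixed-size array plus a length (for a `FixedSizeKey`-like type).
struct StackStorage<const N: usize>;
struct StackBuffer<const N: usize> {
    data: [u8; N],
    len: usize,
}
impl<const N: usize> KeyStorage for StackStorage<N> {
    type Buffer = StackBuffer<N>;
    fn new_buffer(_exact_size: Option<usize>) -> StackBuffer<N> {
        StackBuffer { data: [0; N], len: 0 }
    }
}
impl<const N: usize> Buffer for StackBuffer<N> {
    type Sealed = ([u8; N], usize);
    fn put_slice(&mut self, bytes: &[u8]) {
        // Panics on overflow; a real implementation would pick `N` upfront.
        self.data[self.len..self.len + bytes.len()].copy_from_slice(bytes);
        self.len += bytes.len();
    }
    fn seal(self) -> Self::Sealed {
        (self.data, self.len)
    }
}

/// The single encoding routine, generic over the memory backend.
fn encode<S: KeyStorage>(datums: &[&[u8]]) -> <S::Buffer as Buffer>::Sealed {
    let total: usize = datums.iter().map(|d| d.len()).sum();
    let mut buf = S::new_buffer(Some(total));
    for d in datums {
        buf.put_slice(d);
    }
    buf.seal()
}

fn main() {
    let parts: [&[u8]; 2] = [b"ab", b"cd"];
    // The same bytes are produced regardless of the backend.
    let heap = encode::<HeapStorage>(&parts);
    let (stack, len) = encode::<StackStorage<8>>(&parts);
    assert_eq!(heap, b"abcd".to_vec());
    assert_eq!(&stack[..len], b"abcd");
}
```

The key property is that the encoding logic is written once, while the choice of stack vs. heap is purely a type parameter resolved at compile time.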

That's it. 😇

In the meantime, this PR carefully tunes performance to avoid unnecessary boxing or matching as much as possible, while also fixing some bad patterns. As a result, we see performance improvements in most micro-benchmark cases.

vec ser 1024 rows, bz, int64 Key64, Pr[null]=0
                        time:   [12.110 µs 12.458 µs 12.879 µs]
                        change: [-6.5550% -3.0257% +0.7303%] (p = 0.11 > 0.05)
                        No change in performance detected.
Found 17 outliers among 100 measurements (17.00%)
  7 (7.00%) high mild
  10 (10.00%) high severe

vec deser 1024 rows, bz, int64 Key64, Pr[null]=0
                        time:   [6.6426 µs 6.7088 µs 6.7763 µs]
                        change: [-19.691% -18.668% -17.624%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild

row deser 1024 rows, bz, int64 Key64, Pr[null]=0
                        time:   [25.302 µs 25.378 µs 25.451 µs]
                        change: [-85.429% -85.329% -85.227%] (p = 0.00 < 0.05)
                        Performance has improved. <- 😄

vec ser 1024 rows, bz, int64 Key64, Pr[null]=0.1
                        time:   [34.646 µs 36.310 µs 37.973 µs]
                        change: [+16.574% +22.875% +29.704%] (p = 0.00 < 0.05)
                        Performance has regressed. <- 🥵 needs further investigation

vec deser 1024 rows, bz, int64 Key64, Pr[null]=0.1
                        time:   [7.8786 µs 7.9717 µs 8.0645 µs]
                        change: [-11.989% -10.932% -9.7430%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

row deser 1024 rows, bz, int64 Key64, Pr[null]=0.1
                        time:   [26.320 µs 26.349 µs 26.375 µs]
                        change: [-85.175% -85.086% -84.995%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) low mild

vec ser 1024 rows, bz, varchar KeySerialized, Pr[null]=0
                        time:   [119.24 µs 120.75 µs 122.10 µs]
                        change: [-34.032% -32.912% -31.859%] (p = 0.00 < 0.05)
                        Performance has improved.

vec deser 1024 rows, bz, varchar KeySerialized, Pr[null]=0
                        time:   [92.858 µs 93.598 µs 94.434 µs]
                        change: [-5.2924% -4.4017% -3.4356%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

row deser 1024 rows, bz, varchar KeySerialized, Pr[null]=0
                        time:   [110.16 µs 111.25 µs 112.34 µs]
                        change: [-2.3327% -1.6156% -0.8521%] (p = 0.00 < 0.05)
                        Change within noise threshold.

vec ser 1024 rows, bz, varchar KeySerialized, Pr[null]=0.1
                        time:   [123.36 µs 124.63 µs 125.93 µs]
                        change: [-25.129% -23.725% -22.271%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild

vec deser 1024 rows, bz, varchar KeySerialized, Pr[null]=0.1
                        time:   [86.680 µs 87.380 µs 88.106 µs]
                        change: [-5.9269% -4.8970% -3.9472%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

row deser 1024 rows, bz, varchar KeySerialized, Pr[null]=0.1
                        time:   [101.22 µs 101.75 µs 102.29 µs]
                        change: [-2.7365% -2.1937% -1.6627%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

vec ser 1024 rows, bz, mix fixed and not fixed, case 2 KeySerialized, Pr[null]=0
                        time:   [151.72 µs 152.54 µs 153.40 µs]
                        change: [-23.467% -22.793% -22.183%] (p = 0.00 < 0.05)
                        Performance has improved.

vec deser 1024 rows, bz, mix fixed and not fixed, case 2 KeySerialized, Pr[null]=0
                        time:   [103.82 µs 104.59 µs 105.35 µs]
                        change: [-9.6591% -8.6094% -7.3838%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

row deser 1024 rows, bz, mix fixed and not fixed, case 2 KeySerialized, Pr[null]=0
                        time:   [115.71 µs 116.80 µs 117.83 µs]
                        change: [-2.3239% -1.6436% -0.9661%] (p = 0.00 < 0.05)
                        Change within noise threshold.

vec ser 1024 rows, bz, mix fixed and not fixed, case 2 KeySerialized, Pr[null]=0.1
                        time:   [181.64 µs 184.24 µs 186.88 µs]
                        change: [+0.3257% +2.9275% +5.6019%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

vec deser 1024 rows, bz, mix fixed and not fixed, case 2 KeySerialized, Pr[null]=0.1
                        time:   [95.270 µs 95.779 µs 96.291 µs]
                        change: [-13.247% -12.014% -10.842%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

row deser 1024 rows, bz, mix fixed and not fixed, case 2 KeySerialized, Pr[null]=0.1
                        time:   [109.97 µs 110.85 µs 111.69 µs]
                        change: [+1.8472% +2.5985% +3.3017%] (p = 0.00 < 0.05)
                        Performance has regressed.
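For context, the results above come from Criterion micro-benchmarks in the repo; the shape of a "vec ser 1024 rows" case can be sketched with the standard library alone (illustrative only — the actual benchmark code and key encoding differ):

```rust
use std::time::Instant;

// Sketch of the "vec ser 1024 rows" benchmark shape: build a column of
// int64 values and time how long serializing them into key bytes takes.
fn serialize_keys(col: &[i64]) -> Vec<Vec<u8>> {
    col.iter().map(|v| v.to_ne_bytes().to_vec()).collect()
}

fn main() {
    let col: Vec<i64> = (0..1024).collect();
    let iters: u32 = 1000;
    let start = Instant::now();
    for _ in 0..iters {
        let keys = serialize_keys(&col);
        std::hint::black_box(keys); // prevent the work from being optimized away
    }
    let per_iter = start.elapsed() / iters;
    println!("vec ser 1024 rows: {:?} per iteration", per_iter);
}
```

Criterion additionally runs warm-up iterations, fits a statistical model, and reports the confidence intervals and outlier counts shown above.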

It's still worth noting that #9305 needs to be redone entirely, since hash encoding is not value encoding. Let's do that in follow-up PRs.

Close #9659.

Checklist For Contributors

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features; see Sqlsmith: SQL feature generation #7934.)
  • I have demonstrated that backward compatibility is not broken by breaking changes and created issues to track deprecated features to be removed in the future. (Please refer to the issue)
  • All checks passed in ./risedev check (or alias, ./risedev c)

Checklist For Reviewers

  • I have requested macro/micro-benchmarks as this PR can affect performance substantially, and the results are shown.

Documentation

  • My PR DOES NOT contain user-facing changes.

@github-actions github-actions bot (Contributor) left a comment

license-eye has checked 3400 files in total.

Valid: 1548 · Invalid: 1 · Ignored: 1851 · Fixed: 0

Invalid file list:
  • src/common/src/hash/key_v2.rs

@BugenZhao BugenZhao closed this May 9, 2023
@BugenZhao BugenZhao marked this pull request as ready for review May 9, 2023 14:21
@BugenZhao BugenZhao reopened this May 9, 2023
@BugenZhao BugenZhao added nexmark-q17 ci/run-cpu-flamegraph-cron Runs the cron workflow for generating cpu flamegraph cpu_flamegraph Generates a cpu flamegraph and removed ci/run-cpu-flamegraph-cron Runs the cron workflow for generating cpu flamegraph labels May 9, 2023
BugenZhao added 11 commits May 10, 2023 13:37
Signed-off-by: Bugen Zhao <[email protected]>
@BugenZhao BugenZhao force-pushed the bz/refactor-hash-key branch from 282f580 to 91a74d2 Compare May 10, 2023 05:39
BugenZhao added 2 commits May 10, 2023 15:12
Signed-off-by: Bugen Zhao <[email protected]>
@BugenZhao BugenZhao removed cpu_flamegraph Generates a cpu flamegraph nexmark-q17 labels May 10, 2023
Signed-off-by: Bugen Zhao <[email protected]>
@@ -77,6 +77,7 @@ strum = "0.24"
strum_macros = "0.24"
sysinfo = { version = "0.26", default-features = false }
thiserror = "1"
tinyvec = { version = "1", features = ["rustc_1_55", "grab_spare_slice"] }
Contributor:

Why do we need rustc_1_55 and grab_spare_slice? Maybe we should leave some comments inline.

BugenZhao (Member Author):

This is because some interfaces of tinyvec are only available under these feature flags. This should really be documented by that crate.

.expect("in-memory deserialize should never fail")
.expect("datum should never be NULL");

// TODO: extra unboxing from `ScalarRefImpl` is unnecessary here.
Contributor:

Not quite sure I understand. Why is there boxing and unboxing for ScalarRefImpl?

BugenZhao (Member Author):

By "boxing" here we mean erasing the concrete type of a scalar and putting it into some variant of ScalarImpl. In general, a "boxed" scalar cannot be used directly and requires another match to extract the real scalar, so there's extra overhead.
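The overhead can be illustrated with a toy two-variant enum (the real ScalarImpl has many more variants, and the trait below is a simplified, hypothetical stand-in for the redesigned serde trait):

```rust
// Toy illustration of enum "boxing": the concrete type is erased into a
// variant, and every use pays for a `match` to get it back out.
#[derive(Debug, PartialEq)]
enum ScalarImpl {
    Int64(i64),
    Utf8(String),
}

// Using a "boxed" scalar requires a runtime match to recover the concrete type.
fn serialize_boxed(scalar: &ScalarImpl, buf: &mut Vec<u8>) {
    match scalar {
        ScalarImpl::Int64(v) => buf.extend_from_slice(&v.to_ne_bytes()),
        ScalarImpl::Utf8(s) => buf.extend_from_slice(s.as_bytes()),
    }
}

// A generic (monomorphized) path avoids the runtime dispatch entirely:
// the compiler emits a specialized function per concrete scalar type.
trait HashKeySer {
    fn serialize_into(&self, buf: &mut Vec<u8>);
}
impl HashKeySer for i64 {
    fn serialize_into(&self, buf: &mut Vec<u8>) {
        buf.extend_from_slice(&self.to_ne_bytes());
    }
}

fn main() {
    let mut a = Vec::new();
    serialize_boxed(&ScalarImpl::Int64(42), &mut a);
    let mut b = Vec::new();
    42i64.serialize_into(&mut b);
    // Same bytes, but the second path needs no variant dispatch.
    assert_eq!(a, b);
}
```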

}

/// The buffer for building a hash key on a fixed-size byte array on the stack.
pub struct StackBuffer<const N: usize>(ArrayVec<[u8; N]>);
Contributor:

Why not directly use an array?

BugenZhao (Member Author):

Yep. An ArrayVec is exactly an array plus a "current length", with some methods encapsulated on top. We use ArrayVec simply for a clearer implementation of BufMut. 🤣
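The point can be shown with std only: a bare `[u8; N]` has no notion of how many bytes are filled, so writing into it needs exactly the "array + current length" pair that ArrayVec encapsulates. A minimal stand-in (using `std::io::Write` here rather than `bytes::BufMut`, purely for illustration):

```rust
use std::io::{self, Write};

/// A bare `[u8; N]` can't act as a write target by itself: it has no
/// "current length". Wrapping it with a cursor gives ArrayVec-like
/// behavior using std only.
struct FixedBuf<const N: usize> {
    data: [u8; N],
    len: usize,
}

impl<const N: usize> FixedBuf<N> {
    fn new() -> Self {
        Self { data: [0; N], len: 0 }
    }
    /// Only the filled prefix is meaningful.
    fn as_slice(&self) -> &[u8] {
        &self.data[..self.len]
    }
}

impl<const N: usize> Write for FixedBuf<N> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Copy as many bytes as fit, advancing the cursor.
        let n = buf.len().min(N - self.len);
        self.data[self.len..self.len + n].copy_from_slice(&buf[..n]);
        self.len += n;
        Ok(n)
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() {
    let mut buf = FixedBuf::<8>::new();
    buf.write_all(&1u32.to_ne_bytes()).unwrap();
    buf.write_all(&2u16.to_ne_bytes()).unwrap();
    assert_eq!(buf.as_slice().len(), 6); // 4 + 2 bytes written
}
```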

pub type Key256<B = HeapNullBitmap> = FixedSizeKey<32, B>;
pub type KeySerialized<B = HeapNullBitmap> = SerializedKey<B>;

pub type SerializedKey<B = HeapNullBitmap> = HashKeyImpl<HeapStorage, B>;
Contributor:

I wonder if it will be confusing to have both KeySerialized and SerializedKey 😆. But I don't have a better idea for now.

BugenZhao (Member Author):

These are copied from the original implementation. This PR aims to keep all interfaces untouched so that the changes stay within the common crate. Let's refactor the code structure along with this in future PRs.

@kwannoel kwannoel (Contributor) left a comment:

Generally LGTM. I will make another pass later. Thanks for this!

@BugenZhao (Member Author):

Interesting. Where's the difference? 👀

[screenshot]

@BugenZhao (Member Author):

Oops, there is indeed some difference in the decimal precision. This seems to be an expected behavior change caused by this PR, but is it correct? 👀 cc @xiangjinwu

[screenshot]

@st1page st1page (Contributor) commented May 10, 2023:

FYI, I think PG15's behavior does not give any guarantee.

dev=# create table t_d(v decimal);
dev=# insert into t_d values (3), (3.0), (3.00);
INSERT 0 3
dev=# select * from t_d;
  v   
------
    3
  3.0
 3.00
(3 rows)

dev=# select * from t_d group by v;
 v 
---
 3
(1 row)

dev=# delete from t_d;
DELETE 3
dev=# insert into t_d values (3.0), (3.00);
INSERT 0 2
dev=# select * from t_d group by v;
  v  
-----
 3.0
(1 row)

dev=# delete from t_d;
DELETE 2
dev=# insert into t_d values (3), (3.00);
INSERT 0 2
dev=# select * from t_d group by v;
 v 
---
 3
(1 row)


dev=# delete from t_d;
DELETE 2
dev=# insert into t_d values (3.00), (3);
INSERT 0 2
dev=# select * from t_d group by v;
  v   
------
 3.00
(1 row)

@xiangjinwu (Contributor):

Yes, it is expected. It would have been rendered without the trailing zeros if we did not have the bug, or if the decimal column did not follow a varchar column.
#3873 is exactly what @st1page found above.

Just surfacing some related issues: #3863 #4953

Signed-off-by: Bugen Zhao <[email protected]>
@st1page st1page (Contributor) left a comment:

LGTM. It seems that some other bugs are exposed after this PR 🤣

@BugenZhao (Member Author):

> FYI, I think PG15's behavior does not give any guarantee.

It looks like PostgreSQL always picks the value of the first row. For the sake of determinism in our system, I guess it's reasonable for us to always output the normalized form.
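This connects back to the core invariant of the PR: the hash-key encoding must respect Eq, i.e. equal values (like 3.0 and 3.00) must produce identical key bytes, which normalization guarantees. A toy sketch using a (mantissa, scale) decimal — not RisingWave's actual Decimal type:

```rust
// Toy decimal: value = mantissa * 10^(-scale).
// 3.0 is (30, 1); 3.00 is (300, 2). Both denote the same value.
#[derive(Clone, Copy, Debug)]
struct Dec {
    mantissa: i64,
    scale: u32,
}

impl Dec {
    /// Strip trailing zeros so equal values share one representation.
    fn normalize(self) -> Self {
        let mut d = self;
        while d.scale > 0 && d.mantissa % 10 == 0 {
            d.mantissa /= 10;
            d.scale -= 1;
        }
        d
    }

    /// Hash-key encoding: encode the *normalized* form, so that
    /// `a == b` implies identical key bytes (the Eq invariant).
    fn hash_key_encode(self, buf: &mut Vec<u8>) {
        let n = self.normalize();
        buf.extend_from_slice(&n.mantissa.to_ne_bytes());
        buf.extend_from_slice(&n.scale.to_ne_bytes());
    }
}

fn main() {
    let a = Dec { mantissa: 30, scale: 1 };  // 3.0
    let b = Dec { mantissa: 300, scale: 2 }; // 3.00
    let (mut ka, mut kb) = (Vec::new(), Vec::new());
    a.hash_key_encode(&mut ka);
    b.hash_key_encode(&mut kb);
    assert_eq!(ka, kb); // equal values yield identical key bytes
}
```

Which normalized form to display (3 vs. 3.0 vs. 3.00) is then a separate, purely cosmetic decision, as long as grouping uses the canonical encoding.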

Signed-off-by: Bugen Zhao <[email protected]>
@BugenZhao BugenZhao force-pushed the bz/refactor-hash-key branch from 7de6c67 to 15fdcac Compare May 11, 2023 09:45
@codecov codecov bot commented May 11, 2023:

Codecov Report

Merging #9671 (15fdcac) into main (47fe210) will decrease coverage by 0.01%.
The diff coverage is 67.94%.

@@            Coverage Diff             @@
##             main    #9671      +/-   ##
==========================================
- Coverage   71.02%   71.01%   -0.01%     
==========================================
  Files        1241     1242       +1     
  Lines      208154   208091      -63     
==========================================
- Hits       147844   147785      -59     
+ Misses      60310    60306       -4     
Flag Coverage Δ
rust 71.01% <67.94%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
src/common/src/array/mod.rs 70.54% <18.18%> (-2.89%) ⬇️
src/common/src/hash/key.rs 81.59% <45.00%> (+0.16%) ⬆️
src/common/src/types/interval.rs 80.71% <57.14%> (-0.08%) ⬇️
src/common/src/hash/key_v2.rs 89.87% <89.87%> (ø)

... and 4 files with indirect coverage changes


@BugenZhao BugenZhao added this pull request to the merge queue May 11, 2023
@stdrc stdrc (Member) left a comment:

LGTM!

Merged via the queue into main with commit 77bc5f9 May 11, 2023
@BugenZhao BugenZhao deleted the bz/refactor-hash-key branch May 11, 2023 11:31
Development

Successfully merging this pull request may close these issues.

bug: variant-length hash key uses value encoding which does not respect Eq