Update precombine benchmark to better represent varied workloads #24343

lukecwik · 2022-11-23T21:49:15Z

Represent more data distributions (hot key, uniform, normal, unique)
Run longer, giving the JIT time to compile the hot code paths
Have a random ordering of data
Use a blackhole to prevent the JIT from optimizing away the data

Updated benchmark numbers are (note that I renamed the class before running):

Benchmark                       (distribution)  (globallyWindowed)   Mode  Cnt   Score   Error  Units
CombinerTableBenchmark.combine         uniform                true  thrpt   15  12.838 ± 0.314  ops/s
CombinerTableBenchmark.combine         uniform               false  thrpt   15   5.633 ± 0.283  ops/s
CombinerTableBenchmark.combine          normal                true  thrpt   15   6.869 ± 0.196  ops/s
CombinerTableBenchmark.combine          normal               false  thrpt   15   4.165 ± 0.271  ops/s
CombinerTableBenchmark.combine          hotKey                true  thrpt   15  13.697 ± 0.320  ops/s
CombinerTableBenchmark.combine          hotKey               false  thrpt   15   6.143 ± 0.458  ops/s
CombinerTableBenchmark.combine      uniqueKeys                true  thrpt   15   2.346 ± 0.063  ops/s
CombinerTableBenchmark.combine      uniqueKeys               false  thrpt   15   1.676 ± 0.055  ops/s
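
For readers unfamiliar with the distribution names, here is a minimal sketch of how keys for the four cases could be generated and then randomly ordered. It is not the code from this PR: the class `KeyDistributions`, the method `generateKeys`, the key space size, and the choice of mean and standard deviation for the normal case are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Beam: produces keys under four distributions.
class KeyDistributions {
  enum Distribution { UNIFORM, NORMAL, HOT_KEY, UNIQUE_KEYS }

  static List<String> generateKeys(
      Distribution distribution, int count, int keySpace, long seed) {
    Random random = new Random(seed); // fixed seed keeps runs comparable
    List<String> keys = new ArrayList<>(count);
    for (int i = 0; i < count; i++) {
      int k;
      switch (distribution) {
        case UNIFORM:
          // Every key in [0, keySpace) is equally likely.
          k = random.nextInt(keySpace);
          break;
        case NORMAL:
          // Assumed shape: centered on keySpace / 2, stddev keySpace / 8, clamped.
          k = (int) Math.round(keySpace / 2.0 + random.nextGaussian() * keySpace / 8.0);
          k = Math.max(0, Math.min(keySpace - 1, k));
          break;
        case HOT_KEY:
          // Assumed extreme case: all elements share a single key.
          k = 0;
          break;
        case UNIQUE_KEYS:
          // Every element gets its own key, so nothing can be combined.
          k = i;
          break;
        default:
          throw new IllegalArgumentException("Unknown distribution: " + distribution);
      }
      keys.add("key" + k);
    }
    // Random ordering, so the table does not see keys in generation order.
    Collections.shuffle(keys, random);
    return keys;
  }

  public static void main(String[] args) {
    for (Distribution d : Distribution.values()) {
      List<String> keys = generateKeys(d, 10_000, 1_000, 42L);
      System.out.println(d + ": " + new HashSet<>(keys).size() + " distinct keys");
    }
  }
}
```

The fewer distinct keys a distribution produces, the more elements share a table entry; `UNIQUE_KEYS` is the case where no two elements share one.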

lukecwik · 2022-11-23T21:50:27Z

R: @bhisevishal

github-actions · 2022-11-23T21:51:34Z

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

bhisevishal · 2022-11-28T21:50:47Z

Thanks @lukecwik, looks good. Is it possible to have a multi-threaded benchmark as well?

lukecwik · 2022-11-29T00:32:44Z

> Thanks @lukecwik, looks good. Is it possible to have a multi-threaded benchmark as well?

I'm not sure it will provide much value, but you can always run the benchmark with the additional flag -t 4 to execute it concurrently on 4 threads, or add the annotation @Threads(4) to the benchmark itself.

Note that this benchmark only checks the combiner table and doesn't represent a full transform graph.
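
As a rough illustration of the annotation route, the sketch below shows a JMH benchmark method marked with `@Threads(4)`. It assumes JMH is on the classpath; the class `ThreadedCombineSketch`, its tiny key array, and the `HashMap`-based combining are placeholders and not the actual CombinerTableBenchmark code.

```java
import java.util.HashMap;
import java.util.Map;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;
import org.openjdk.jmh.infra.Blackhole;

// Hypothetical benchmark class; Scope.Thread gives each worker thread its own state instance.
@State(Scope.Thread)
public class ThreadedCombineSketch {
  // Placeholder input; a real benchmark would prepare its data in a @Setup method.
  private final String[] keys = {"a", "b", "a", "c", "a"};

  @Benchmark
  @Threads(4) // equivalent in effect to passing -t 4 on the JMH command line
  public void combine(Blackhole blackhole) {
    Map<String, Long> sums = new HashMap<>();
    for (String key : keys) {
      // Stand-in for combining: count the occurrences of each key.
      sums.merge(key, 1L, Long::sum);
    }
    // Hand the result to the blackhole so the JIT cannot eliminate the work.
    blackhole.consume(sums);
  }
}
```

Because each thread works on its own state here, this measures independent tables running in parallel rather than contention on a shared one.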

Update precombine bencmark to better represent varied workloads

f3af466

github-actions bot added the java label Nov 23, 2022

lukecwik changed the title ~~Update precombine bencmark to better represent varied workloads~~ Update precombine benchmark to better represent varied workloads Nov 23, 2022

lukecwik merged commit 135007e into apache:master Nov 29, 2022
