Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split bench, test, lib, bin into different workspaces #3360

Closed
wants to merge 5 commits into from

Conversation

breezewish
Copy link
Member

@breezewish breezewish commented Jul 26, 2018

What have you changed? (mandatory)

There is a pain point when writing benchmarks for TiKV now: we need to wait several minutes (usually around 10 minutes) even if we only added a blank line to the benchmark suite (benches directory).

This is caused by several issues:

  1. Currently the content of tikv library src/lib.rs changes at every build, because we embed a UTC build time in it. This dynamic part is now removed from lib.rs, so that it won't trigger a rebuild if its content is not changed.
  2. When changing files in the benchmark directory, cargo treats it as a change in the package and triggers a rebuild on lib.rs. By splitting benchmarks, tests, libs and binaries into different workspaces, they are isolated and won't affect each other.

In addition, this PR made some other improvements together:

  1. Removed unused imported crates.
  2. Make cargo clippy cover fuzz tests.
  3. Remove the ugly #[path = "../../tests/raftstore/..."], by introducing library to tests and benches.
  • Check profiling feature under Linux.

  • Check compile performance when only bench is changed.

    $ make bench
    cargo bench --package tikv-lib --features "default sse no-fail"  -- --nocapture && \
    	cargo bench --package tikv-bench --benches --features "default sse no-fail"  -- --nocapture && \
    	cargo run --package tikv-bench --bin tikv-bench --release --features "default sse no-fail"
        Finished release [optimized + debuginfo] target(s) in 1.34s                                                                                                                                                 
         Running target/release/deps/tikv-9d709eab255f1d37
    
    running 499 tests
    ……
    
    test result: ok. 0 passed; 0 failed; 496 ignored; 3 measured; 0 filtered out
    
       Compiling tikv-bench v0.0.1 (file:///Users/breezewish/Work/PingCAP/tikv/src/tikv_benches)                                                                                                                    
        Finished release [optimized + debuginfo] target(s) in 6.70s                                                                                                                                                 
         Running target/release/deps/tikv_bench-8523a19eecffd635
    
    running 0 tests
    
    test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
    
         Running target/release/deps/benches-e9c1f9ccbee2aec3
    
    running 12 tests
    ……

    As you can see, if we only change benchmarks in the benches directory (now it is tikv_benches), it only costs about 6 seconds to compile!

What are the type of the changes? (mandatory)

  • Improvement (non-breaking change which is an improvement to an existing feature)

How has this PR been tested? (mandatory)

CI

Does this PR affect documentation (docs/docs-cn) update? (mandatory)

No

Does this PR affect tidb-ansible update? (mandatory)

No

Refer to a related PR or issue link (optional)

Benchmark result if necessary (optional)

Add a few positive/negative examples (optional)

[profile.release]
lto = true
opt-level = 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the opt-level, on purpose?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a default setting according to cargo manifests documentation.

# TODO: remove this once rust-lang/rust#50199 is resolved.
codegen-units = 2

[profile.bench]
debug = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug mode will very significantly impact performance no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is debug symbols. I think it mostly bloats the file size, instead of slow down the runtime. Anyway, our release build enables this option as well. So in order to make it easy to debug benches, we can enable it for bench as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes of course. I will keep you updated for this one :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW I think the important thing is to keep the same profile in release and in bench. (debug=true is enabled for release) Otherwise what are we benching for :)

@Hoverbear
Copy link
Contributor

I feel like moving /benches (which is a standard convention in Rust) to a workspace project is coming the wrong way at the problem.

When I look at https://github.com/habitat-sh/habitat/ , which also uses workspaces, they do nothing like this. Instead they split components apart logically. What do you think?

On a different point:

Would you be willing to split this up and have the other PR resolve these points?

  • Currently the content of tikv library src/lib.rs changes at every build, because we embed a UTC build time in it. This dynamic part is now removed from lib.rs, so that it won't trigger a rebuild if its content is not changed.

(And maybe?...)

  • Removed unused imported crates.
  • Make cargo clippy cover fuzz tests.
  • Remove the ugly #[path = "../../tests/raftstore/..."], by introducing library to tests and benches.

Seems like this would be plenty of changes for one PR and it would make the impact of using workspaces in this way more clear.

@breezewish
Copy link
Member Author

@Hoverbear

Yes, this is my concern as well 🤣. However this PR is not splitting components (another PR should do this). Also I intentionally moved benches and tests into src/tikv_benches and src/tikv_tests to avoid someone mistakenly treat it as a Rust style. As you can see, they are managed by different workspaces so that it won't work as benches and tests of the root project (dependencies to the "tikv_test" and "tikv_bench" will not be recognized). In addition, for example, in servo, their tests are organized differently and I guess it should be Ok.

Currently the content of tikv library src/lib.rs changes at every build, because we embed a UTC build time in it. This dynamic part is now removed from lib.rs, so that it won't trigger a rebuild if its content is not changed.

This can't be extracted in a single build, because it won't work without a workspace separation 🤣In fact I tried this first, and found out that because the output of build.rs is changed, it will recompile the whole project again.

Removed unused imported crates.

This can be done in separate PRs. However each Cargo.toml will bloat to contain these unused dependencies.

Make cargo clippy cover fuzz tests.

This can be done in separate PRs. In fact this is a side effect of changing the clippy command to cover all workspaces 🤣Previously child workspaces are covered.

Remove the ugly #[path = "../../tests/raftstore/..."], by introducing library to tests and benches.

This one might be possible 🤣However it may be hard to do. I can have a try though. (and you will find that diffs may not be eliminated 🤣)

The changes above are not involved with logical changes. As long as the compiling pass, it is safe. What do you think?

@breezewish breezewish mentioned this pull request Jul 27, 2018
@siddontang
Copy link
Contributor

This is a huge change, I suggest you make an agenda and tell others why and how do you want to do @breeswish

@breezewish
Copy link
Member Author

@siddontang Ok

@Hoverbear
Copy link
Contributor

@breeswish

This can be done in separate PRs. However each Cargo.toml will bloat to contain these unused dependencies.

Oh, I better understand why this happened now. I think it's ok to include that in this PR. Sorry. :)

Yes, this is my concern as well rofl. However this PR is not splitting components (another PR should do this). Also I intentionally moved benches and tests into src/tikv_benches and src/tikv_tests to avoid someone mistakenly treat it as a Rust style.

Hm, I'd like to have a strong motivation to so clearly and strongly break conventions. I worry this will impact our ability to receive new contributions. It is already quite some effort to understand the project, and this will only add to it.

When changing files in the benchmark directory, cargo treats it as a change in the package and triggers a rebuild on lib.rs. By splitting benchmarks, tests, libs and binaries into different workspaces, they are isolated and won't affect each other.

I think this might be a bug. Can we make a reproduction case (a very minimal crate) and open an issue about it? I can do this if you're busy. :)

There is a pain point when writing benchmarks for TiKV now: we need to wait several minutes (usually around 10 minutes)

🤔 My machine spends much more time on kvproto and rocksdb then it does on TiKV. Rebuilding TiKV itself is reasonable. I think it can be faster though.

# build w/ Rocksdb + kvproto uncompiled
Finished dev [unoptimized + debuginfo] target(s) in 5m 59s
# build w/ rocksdb + kvproto compiled after touching `benches/benches.rs`                                                                                      
Finished dev [unoptimized + debuginfo] target(s) in 41.88s
# Check, same as before.
Finished dev [unoptimized + debuginfo] target(s) in 9.20s

One thing I did notice though was that tikv seems to build multiple times (Cargo actually lists tikv like 5 times in it's work queue). I wonder if this is related to the timestamp?

Building [====================================================>  ] 217/222: tikv, tikv, tikv, tikv, tikv   

I very much support this effort and would love to see progress towards better compile times and organization, but we should be careful to do it in a way that does not pile us with any technical debt to clean later. This should be a way of relieving some of that debt.

@breezewish
Copy link
Member Author

breezewish commented Jul 27, 2018

@Hoverbear

I think this might be a bug. Can we make a reproduction case (a very minimal crate) and open an issue about it? I can do this if you're busy. :)

This might be a consequence of no incremental compilation in release mode. I will take a deeper look at this.

Update: I believe this issue rust-lang/cargo#1430 involves in this behavior as well. According to the description of this issue, even if our build.rs only changes the result of binary instead of library, cargo will treat it as a change to the library.

One thing I did notice though was that tikv seems to build multiple times (Cargo actually lists tikv like 5 times in it's work queue). I wonder if this is related to the timestamp?

Actually it is Cargo building different parts (i.e. tests, libs, bins) of the tikv package. In this PR, since these parts are given different package names, you will see a better progress, like "........ tikv-bin, tikv-lib, tikv-tests".

Hm, I'd like to have a strong motivation to so clearly and strongly break conventions. I worry this will impact our ability to receive new contributions. It is already quite some effort to understand the project, and this will only add to it.

Agree 🤣

@breezewish
Copy link
Member Author

I made some new progress about accelerating the bench compiling speed. A new PR which is much more simple than this one will be proposed soon.

@breezewish breezewish closed this Jul 28, 2018
@breezewish breezewish deleted the wenxuan/_split_bench branch July 31, 2018 10:09
@breezewish
Copy link
Member Author

ref #3352

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants