Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make util::Progress thread-safe as prerequisite of #11448 #11602

Closed
wants to merge 1 commit into from

Conversation

Byron
Copy link
Member

@Byron Byron commented Jan 20, 2023

This is a pre-requisite for obtaining progress information from gitoxde which is integrated via this PR.

Obtaining progress information works by having a thread poll gitoxide's progress in regular intervals.

Alternatives Considered

gitoxide hands progress information to a Progress trait which allows it to build hierarchies of progress information to reflect potentially concurrent state updates. It also assumes that progress updates are fast and doesn't rate-limit its own calls.

Implementing this trait for Progress would have to involve rate limiting, which typically uses the local time to determine if a call will cause a display update or not, which at least in the past has been prohibitively slow. By now, at least on linux, I think this has been fixed but I am not sure about Windows.

As there are uncertainties to this approach and which also comes with considerable higher complexity (and more code), I decided to instead make util::Progress thread-safe with the minimal amount of changes.

@rustbot
Copy link
Collaborator

rustbot commented Jan 20, 2023

r? @ehuss

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 20, 2023
@Byron Byron mentioned this pull request Jan 20, 2023
16 tasks
@Byron Byron force-pushed the threadsafe-progress branch from 4b561ab to 950cbcc Compare January 22, 2023 10:26
@@ -50,6 +50,7 @@ libgit2-sys = "=0.14.1"
memchr = "2.1.3"
opener = "0.5"
os_info = "3.5.0"
parking_lot = "0.12.1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you say why this needs parking_lot?
I'm generally reluctant to bring in new dependencies unless there is a strong reason. They increase build times, increase build and porting hazards, and potentially increase maintenance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From #11448

It's both. I used std::sync::Mutex in an earlier iteration but found myself writing .unwrap() or .expect("thread didn't panic") which seemed worse than using parking_lot explicitly which is already part of the dependency tree.

Except I just looked and do not see parking_lot in the dependency tree

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think this might have come up in the other PR but wasn't addressed.

I chose it to avoid dealing with lock poisoning. Not using it either meant that fn -> T has to become fn -> Result<T> where ever the lock is obtained, or that I'd have to use unwrap() or expect("no panic in thread"), both of which seemed like the something to avoid.

Is there other options you see, or a choice you would make from the alternatives presented here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except I just looked and do not see parking_lot in the dependency tree

It's coming in with gitoxide.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically for dealing with that I just call .unwrap(). The chance of having a poisoned lock is small if you keep the locked region small, and avoid other panics within it. And the consequences are usually small, unless you try to lock a mutex in a drop() method, which I would usually not recommend (though it looks like this PR does that).

It is certainly not ideal, and there has been a lot of discussion on whether or not to actually try to introduce a different API. There is risk with the design of parking_lot since invariants may no longer hold if some locked region only partially completed its work.

I think it should probably be OK to bring it in if it will be required by gitoxide.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should probably be OK to bring it in if it will be required by gitoxide.

I left parking_lot in for now but will be happy to change to an std::sync::Mutex with unwrap() if there are any more concerns. Just because gitoxide uses parking_lot in some capacity doesn't mean it has to be used here - there are good reasons for lock poisoning and I don't understand the tradeoffs that parking_lot made, but merely chose the simple route assuming that it's probably alright if so many others do it - clearly some sort of fallacy.

Copy link
Contributor

@ehuss ehuss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if an alternate approach would be to use a message queue? Cargo's own multi-threaded section does that (including handling progress). I'm not sure I follow the alternative discussion around a Progress trait. Does gitoxide generate progress updates at a very high rate? Could Cargo's own Throttle type be used to help here?

I am a little concerned about the risk of deadlock, or future changes thinking it is OK to grab a Shell from a thread without considering what other threads are doing with the console output. Perhaps that risk is low, but this is a little concerning.

Also, what have you done to test these changes? There are very few tests for the progress bar, so we need to do some manual verification.

Comment on lines 221 to 222
if self.shell.lock().is_cleared() || self.last_line.as_ref() != Some(&line) {
let mut shell = self.shell.lock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it should lock the shell outside of the if, otherwise something could run in-between the two lock calls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great catch, I have made the adjustment.

Comment on lines 235 to 236
if self.last_line.is_some() && !self.shell.lock().is_cleared() {
self.shell.lock().err_erase_line();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly about having a single lock.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, the adjustment was made.

@@ -393,8 +394,13 @@ impl Config {
}

/// Gets a reference to the shell, e.g., for writing error messages.
pub fn shell(&self) -> RefMut<'_, Shell> {
self.shell.borrow_mut()
pub fn shell(&self) -> MutexGuard<'_, Shell> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the docstring here include some warnings about the restrictions the caller should be careful about? For example, this should not be called in the same thread if that thread already has a shell (otherwise it would deadlock).

Overall I'm a bit nervous about this since there won't be any compile time checks, and there are a lot of places that get a shell. I don't think I can review all of those call sites. I think the current code is probably ok, otherwise the RefCel would have panic'ed, but a panic is a lot better than a deadlock.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good call, and I see the issue with going RefCell to Mutex as well. So much so that I'd also love to not have to do that.

For now I have added a big deadlock notice.

This is a pre-requisite for obtaining progress information from `gitoxde`
which is integrated [via this PR](rust-lang#11448).

Obtaining progress information works by having a thread poll `gitoxide`'s progress
in regular intervals.
@Byron Byron force-pushed the threadsafe-progress branch from 950cbcc to 383d767 Compare January 27, 2023 06:31
@Byron
Copy link
Member Author

Byron commented Jan 27, 2023

Thanks @ehuss for the review and suggestions - I will take another stab at a good alternative which is offloading the computation to a thread and keep the ProgressBar in the main thread, feeding it with messages. I am pretty sure I tried that but it failed because some other type couldn't be used in a thread that was needed.
However, I just looked at it again and believe that this dependency can be removed to make it work. I will give it another shot asap.

Does gitoxide generate progress updates at a very high rate? Could Cargo's own Throttle type be used to help here?

Yes, it will, but with Throttle it should be fine. Implementing custom Progress is doable, but the code will definitely be harder to maintain due to increased complexity.

To sum it up: I also think this change introduces some insecurity into the code-base that is best avoided, and it will be worth it to try harder to instead offload the gitixode operation into its own thread to have the main thread process progress instead. It should be possible, and I will be back with the results which hopefully leads to the closure of this PR.

@Byron
Copy link
Member Author

Byron commented Jan 27, 2023

It took me a while but I managed to manufacture a version that offloads the gitoxide fetch into a separate thread and processes the progress in the main thread 🎉. To me this seems like the solution I always wanted. The required modification will be put on top of the integration PR.

@Byron Byron closed this Jan 27, 2023
bors added a commit that referenced this pull request Mar 2, 2023
gitoxide integration: fetch

This PR is the first step towards resolving #1171.

In order to get there, we integrate `gitoxide` into `cargo` in such a way that one can control its usage in nightly via `-Zgitoxide` or `Zgitoxide=<feature>[,featureN]`.

Planned features are:

* **fetch** - all fetches are done with `gitxide` (this PR)
* **shallow_index** - the crates index will be a shallow clone (_planned_)
* **shallow_deps** - git dependencies will be a shallow clone (_planned_)
* **checkout** - plain checkouts with `gitoxide` (_planned_)

The above list is a prediction and might change as we understand the requirements better.

### Testing and Transitioning

By default, everything stays as is. However, relevant tests can be re-runwith `gitoxide` using

```
RUSTFLAGS='--cfg always_test_gitoxide' cargo test git
```

There are about 200 tests with 'git' in their name and I plan to enable them one by one. That way the costs for CI stay managable (my first measurement with one test was 2min 30s), while allowing to take one step at a time.

Custom tests shall be added once we realize that more coverage is needed.

That way we should be able to maintain running `git2` and `gitoxide` side by side until we are willing to switch over to `gitoxide` entirely on stable cargo. Then turning on `git2` might be a feature toggle for a while until we finally remove it from the codebase.

_Please see the above paragraph as invitation for discussion, it's merely a basis to explore from and improve upon._

### Tasks

* [x] add feature toggle
* [x] setup test system with one currently successful test
* [x] implement fetch with `gitoxide` (MVP)
* [x] fetch progress
* [x] detect spurious errors
* [x] enable as many git tests as possible (and ignore what's not possible)
* [x] fix all git-related test failures (except for 1: built-in upload-pack, skipped for now)
* [x] validate that all HTTP handle options that come from `cargo` specific values are passed to `gitoxide`
* [x] a test to validate `git2` code can handle crates-index clones created with `gitoxide` and vice-versa
* [x] remove patches that enabled `gitoxide` enabled testing - it's not used anymore
* [x] ~~remove all TODOs and use crates-index version of `git-repository`~~ The remaining 2 TODO's are more like questions for the reviewer.
* [x] run all tests with gitoxide on the fastest platform as another parallel task
* [x] switch to released version
* [x] [Tasks from first review round](#11448 (comment))
* [x] create a new `gitoxide` release and refer to the latest version from crates.io (instead of git-dependency)
* [x] [address 2nd review round comments](#11448 (comment))

### Postponed Tasks

I suggest to go breadth-first and implement the most valuable features first, and then aim for a broad replacement of `git2`. What's left is details and improved compatibility with the `git2` implementation that will be required once `gitoxide` should become the default implementation on stable to complete the transition.

* **built-in support for serving the `file` protocol** (i.e. without using `git`). Simple cases like `clone` can probably be supported quickly, `fetch` needs more work though due to negotiation.
* SSH name fallbacks via a native (probably ~~libssh~~ (avoid LGPL) `libssh2` based) transport. Look at [this issue](#2399) for some history.
* additional tasks from [this tracking issue](GitoxideLabs/gitoxide#450 (comment))

### Proposed Workflow

I am now using [stacked git](https://stacked-git.github.io) to keep commits meaningful during development. This will also mean that before reviews I will force-push a lot as changes will be bucketed into their respective commits.

Once review officially begins I will stop force-pushing and create small commits to address review comments. That way it should be easier to understand how things change over time.

Those review-comments can certainly be squashed into one commit before merging.

_Please let me know if this is feasible or if there are other ways of working you prefer._

### Development notes

* unrelated: [this line](https://github.com/rust-lang/cargo/blob/9827412fee4f5a88ac85e013edd954b2b63f399b/src/cargo/ops/registry.rs#L620) refers to an issue that has since been resolved in `curl`.
* Additional tasks related to a correct fetch implementation are collected in this [tracking issue](GitoxideLabs/gitoxide#450). **These affect how well the HTTP transport can be configured, needs work**
* _authentication_ [is quite complex](https://github.com/rust-lang/cargo/blob/37cad5bd7f7dcd2f6d3e45312a99a9d3eec1e2a0/src/cargo/sources/git/utils.rs#L490) and centred around making SSH connections work. This feature is currently the weakest in `gitoxide` as it simply uses `ssh` (the program) and calls it a day.  No authentication flows are supported there yet and the goal would be to match `git` there at least (which it might already do by just calling `ssh`). Needs investigation. Once en-par with `git` I think `cargo` can restart the whole fetch operation to try different user names like before.
   - the built-in `ssh`-program based transport can now understand permission-denied errors, but the capability isn't used after all since a builtin ssh transport is required.
* It would be possible to implement `git::Progress` and just ignore most of the calls, but that's known to be too slow as the implementation assumes a `Progress::inc()` call is as fast as an atomic increment and makes no attempt to reduce its calls to it.
* learning about [a way to get custom traits in `thiserror`](dtolnay/thiserror#212) could make spurious error checks nicer and less error prone during maintenance. It's not a problem though.
* I am using `RUSTFLAGS=--cfg` to influence the entire build and unit-tests as environment variables didn't get through to the binary built and run for tests.

### Questions

* The way `gitoxide` is configured the user has the opportunity to override these values using more specific git options, for example using url specific http settings. This looks like a feature to me, but if it's not `gitoxide` needs to provide a way to disable applying these overrides. Please let me know what's desired here - my preference is to allow overrides.
* `gitoxide` currently opens repositories similar to how `git` does which respects git specific environment variables. This might be a deviation from how it was before and can be turned off. My preference is to see it as a feature.

### Prerequisite PRs

* #11602
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants