Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement owned ops for HashSet and BTreeSet #109402

Closed
wants to merge 3 commits into from

Conversation

WaffleLapkin
Copy link
Member

This PR implements ops::{BitAnd, BitOr, BitXor, Sub} where one, or both arguments are owned (previously we only had impls for references to sets).

The main advantage of these impls is lifted T: Clone bounds (and, in case of HashSet, lifted S: Default). The secondary one is being able to reuse allocations. Lastly, they may (or may not) be more performant.

Added public APIs (insta-stable):

#![stable(feature = "set_owned_ops", since = "CURRENT_RUSTC_VERSION")]

impl<T: Eq + Hash, S: BuildHasher> BitOr<HashSet<T, S>> for HashSet<T, S>;
impl<T: Eq + Hash, S: BuildHasher> BitXor<HashSet<T, S>> for HashSet<T, S>;
impl<T: Eq + Hash, S: BuildHasher> BitAnd<&HashSet<T, S>> for HashSet<T, S>;
impl<T: Eq + Hash, S: BuildHasher> Sub<&HashSet<T, S>> for HashSet<T, S>;

impl<T: Ord, A: Allocator + Clone> BitOr<BTreeSet<T, A>> for BTreeSet<T, A>;
impl<T: Ord, A: Allocator + Clone> BitXor<BTreeSet<T, A>> for BTreeSet<T, A>;
impl<T: Ord, A: Allocator + Clone> BitAnd<&BTreeSet<T, A>> for BTreeSet<T, A>;
impl<T: Ord, A: Allocator + Clone> Sub<&BTreeSet<T, A>> for BTreeSet<T, A>;

// For all of the above, `type Output = Self;`

I only added the "important" impls, i.e. such that allow lifting the bounds. Thus union-like ops (or, xor) have both arguments owned (because they need elements from both sets), while difference-like ops (sub, and) have only self owned (that's sufficient to not clone elements).

Other potentially interesting implementations (in the order of most to least interesting according to myself):

  1. BitAnd<Set<T, ...>> for Set<T, ...> — optimization potential (can keep either allocation/iterate either set)
  2. BitOr<&Set<T, ...>> for Set<T, ...> and BitOr<&Set<T, ...>> for Set<T, ...> — still require T: Clone, but can keep the allocation/hasher
    • Symmetric BitOr<Set<T, ...>> for &Set<T, ...> and BitOr<Set<T, ...>> for &Set<T, ...> could be added for completeness
  3. BitAnd<Set<T, ...>> for &Set<T, ...> — symmetry for completeness
  4. Sub<Set<T, ...>> for Set<T, ...> and Sub<Set<T, ...>> for &Set<T, ...> — completeness (no bounds lifting/optimization gains)

r? libs-api

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 20, 2023
@rustbot

This comment was marked as resolved.

@WaffleLapkin WaffleLapkin added A-collections Area: `std::collection` T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. and removed T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 20, 2023
@workingjubilee workingjubilee added the needs-fcp This change is insta-stable, so needs a completed FCP to proceed. label Jul 31, 2023
@dtolnay dtolnay assigned dtolnay and unassigned joshtriplett Sep 17, 2023
@dtolnay
Copy link
Member

dtolnay commented Sep 17, 2023

@rust-lang/libs-api:
@rfcbot fcp merge

This adds the following binary operator impls shown in bold below. There is a pre-existing more restrictive impl corresponding to each one; the additional bounds required by the pre-existing impls are noted below.

All of the following require T: Eq + Hash, S: BuildHasher:

  • HashSet<T, S> | HashSet<T, S> (union)
    • existing impl: &HashSet<T, S> | &HashSet<T, S> requires T: Clone, S: Default
  • HashSet<T, S> ^ HashSet<T, S> (symmetric difference)
    • existing impl: &HashSet<T, S> ^ &HashSet<T, S> requires T: Clone, S: Default
  • HashSet<T, S> & &HashSet<T, S> (intersection)
    • existing impl: &HashSet<T, S> & &HashSet<T, S> requires T: Clone, S: Default
  • HashSet<T, S> - &HashSet<T, S> (difference)
    • existing impl: &HashSet<T, S> - &HashSet<T, S> requires T: Clone, S: Default

All of the following require T: Ord, A: Allocator + Clone:

  • BTreeSet<T, A> | BTreeSet<T, A> (union)
    • existing impl: &BTreeSet<T, A> | &BTreeSet<T, A> requires T: Clone
  • BTreeSet<T, A> ^ BTreeSet<T, A> (symmetric difference)
    • existing impl: &BTreeSet<T, A> ^ &BTreeSet<T, A> requires T: Clone
  • BTreeSet<T, A> & &BTreeSet<T, A> (intersection)
    • existing impl: &BTreeSet<T, A> & &BTreeSet<T, A> requires T: Clone
  • BTreeSet<T, A> - &BTreeSet<T, A> (difference)
    • existing impl: &BTreeSet<T, A> - &BTreeSet<T, A> requires T: Clone

@rfcbot
Copy link

rfcbot commented Sep 17, 2023

Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Sep 17, 2023
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Sep 18, 2023
Add `minmax{,_by,_by_key}` functions to `core::cmp`

This PR adds the following functions:

```rust
// mod core::cmp
#![unstable(feature = "cmp_minmax")]

pub fn minmax<T>(v1: T, v2: T) -> [T; 2]
where
    T: Ord;

pub fn minmax_by<T, F>(v1: T, v2: T, compare: F) -> [T; 2]
where
    F: FnOnce(&T, &T) -> Ordering;

pub fn minmax_by_key<T, F, K>(v1: T, v2: T, mut f: F) -> [T; 2]
where
    F: FnMut(&T) -> K,
    K: Ord;
```
(they are also `const` under `#[feature(const_cmp)]`, I've omitted `const` stuff for simplicity/readability)

----

Semantically these functions are equivalent to `{ let mut arr = [v1, v2]; arr.sort(); arr }`, but since they operate on 2 elements only, they are implemented as a single comparison.

Even though that's basically a sort, I think "sort 2 elements" operation is useful on it's own in many cases. Namely, it's a common pattern when you have 2 things, and need to know which one is smaller/bigger to operate on them differently.

I've wanted such functions countless times, most recently in rust-lang#109402, so I thought I'd propose them.

----

r? libs-api
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Sep 18, 2023
Rollup merge of rust-lang#109409 - WaffleLapkin:progamer, r=dtolnay

Add `minmax{,_by,_by_key}` functions to `core::cmp`

This PR adds the following functions:

```rust
// mod core::cmp
#![unstable(feature = "cmp_minmax")]

pub fn minmax<T>(v1: T, v2: T) -> [T; 2]
where
    T: Ord;

pub fn minmax_by<T, F>(v1: T, v2: T, compare: F) -> [T; 2]
where
    F: FnOnce(&T, &T) -> Ordering;

pub fn minmax_by_key<T, F, K>(v1: T, v2: T, mut f: F) -> [T; 2]
where
    F: FnMut(&T) -> K,
    K: Ord;
```
(they are also `const` under `#[feature(const_cmp)]`, I've omitted `const` stuff for simplicity/readability)

----

Semantically these functions are equivalent to `{ let mut arr = [v1, v2]; arr.sort(); arr }`, but since they operate on 2 elements only, they are implemented as a single comparison.

Even though that's basically a sort, I think "sort 2 elements" operation is useful on it's own in many cases. Namely, it's a common pattern when you have 2 things, and need to know which one is smaller/bigger to operate on them differently.

I've wanted such functions countless times, most recently in rust-lang#109402, so I thought I'd propose them.

----

r? libs-api
@jdahlstrom
Copy link

It's going to be a small but weird asymmetry and an ergonomics papercut if some operators can take owned arguments and others cannot. Adding all the combinations would also fix the issue that code like a & &b or &a & &b looks quite confusing.

@dtolnay dtolnay added S-waiting-on-fcp Status: PR is in FCP and is awaiting for FCP to complete. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 15, 2023
@WaffleLapkin WaffleLapkin removed the needs-fcp This change is insta-stable, so needs a completed FCP to proceed. label Dec 12, 2023
@bors
Copy link
Contributor

bors commented Aug 5, 2024

☔ The latest upstream changes (presumably #128672) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Aug 5, 2024
@joshtriplett
Copy link
Member

joshtriplett commented Aug 6, 2024

Approving this, but also, please do update the PR to support BitAnd and Sub with both operands owned. Sub just for completeness; BitAnd because it can optimize by checking the sizes first.

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Aug 6, 2024
@rfcbot
Copy link

rfcbot commented Aug 6, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Aug 6, 2024
@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Aug 16, 2024
@rfcbot
Copy link

rfcbot commented Aug 16, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

Copy link
Member

@dtolnay dtolnay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@dtolnay
Copy link
Member

dtolnay commented Aug 16, 2024

I rebased to fix a trivial conflict with #128309 in library/alloc/src/collections/btree/set.rs.

<<<<<<<
  // BTreeSet cursor stuff from #128309
|||||||
=======
  // fn minmax_by_key from #109402
>>>>>>>

@dtolnay
Copy link
Member

dtolnay commented Aug 16, 2024

Approving this, but also, please do update the PR to support BitAnd and Sub with both operands owned. Sub just for completeness; BitAnd because it can optimize by checking the sizes first.

I am on board with those new impls, but it'll need another FCP with the team so let's leave this to a followup PR.

@dtolnay
Copy link
Member

dtolnay commented Aug 16, 2024

@bors r+

@bors
Copy link
Contributor

bors commented Aug 16, 2024

📌 Commit 80b47a3 has been approved by dtolnay

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 16, 2024
@rust-log-analyzer
Copy link
Collaborator

The job mingw-check failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
#16 2.657 Building wheels for collected packages: reuse
#16 2.658   Building wheel for reuse (pyproject.toml): started
#16 2.903   Building wheel for reuse (pyproject.toml): finished with status 'done'
#16 2.904   Created wheel for reuse: filename=reuse-4.0.3-cp310-cp310-manylinux_2_35_x86_64.whl size=132715 sha256=dfa09868353292d98f811d3efdb0d54d07389e808efc71d68e3b93c514bf8bec
#16 2.904   Stored in directory: /tmp/pip-ephem-wheel-cache-izj4x9v2/wheels/3d/8d/0a/e0fc6aba4494b28a967ab5eaf951c121d9c677958714e34532
#16 2.907 Installing collected packages: boolean-py, binaryornot, tomlkit, reuse, python-debian, markupsafe, license-expression, jinja2, chardet, attrs
#16 3.300 Successfully installed attrs-23.2.0 binaryornot-0.4.4 boolean-py-4.0 chardet-5.2.0 jinja2-3.1.4 license-expression-30.3.0 markupsafe-2.1.5 python-debian-0.1.49 reuse-4.0.3 tomlkit-0.13.0
#16 3.301 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
#16 DONE 3.4s
---
    Checking sharded-slab v0.1.7
error[E0308]: mismatched types
   --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/ahash-0.8.11/src/hash_set.rs:134:31
    |
134 |         AHashSet(self.0.bitor(&rhs.0))
    |                         ----- ^^^^^^ expected `HashSet<T, S>`, found `&HashSet<T, S>`
    |                         arguments to this method are incorrect
    |
    = note: expected struct `std::collections::HashSet<_, _>`
            found reference `&std::collections::HashSet<_, _>`
            found reference `&std::collections::HashSet<_, _>`
help: the return type of this call is `&std::collections::HashSet<T, S>` due to the type of the argument passed
   --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/ahash-0.8.11/src/hash_set.rs:134:18
    |
134 |         AHashSet(self.0.bitor(&rhs.0))
    |                               |
    |                               this argument influences the return type of `bitor`
note: method defined here
   --> /checkout/library/core/src/ops/bit.rs:261:8
   --> /checkout/library/core/src/ops/bit.rs:261:8
    |
261 |     fn bitor(self, rhs: Rhs) -> Self::Output;
help: consider removing the borrow
    |
    |
134 -         AHashSet(self.0.bitor(&rhs.0))
134 +         AHashSet(self.0.bitor(rhs.0))

error[E0308]: mismatched types
   --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/ahash-0.8.11/src/hash_set.rs:198:32
    |
    |
198 |         AHashSet(self.0.bitxor(&rhs.0))
    |                         ------ ^^^^^^ expected `HashSet<T, S>`, found `&HashSet<T, S>`
    |                         arguments to this method are incorrect
    |
    = note: expected struct `std::collections::HashSet<_, _>`
            found reference `&std::collections::HashSet<_, _>`
            found reference `&std::collections::HashSet<_, _>`
help: the return type of this call is `&std::collections::HashSet<T, S>` due to the type of the argument passed
   --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/ahash-0.8.11/src/hash_set.rs:198:18
    |
198 |         AHashSet(self.0.bitxor(&rhs.0))
    |                                |
    |                                this argument influences the return type of `bitxor`
note: method defined here
   --> /checkout/library/core/src/ops/bit.rs:361:8
   --> /checkout/library/core/src/ops/bit.rs:361:8
    |
361 |     fn bitxor(self, rhs: Rhs) -> Self::Output;
help: consider removing the borrow
    |
    |
198 -         AHashSet(self.0.bitxor(&rhs.0))
198 +         AHashSet(self.0.bitxor(rhs.0))

error[E0507]: cannot move out of `self` which is behind a shared reference
   --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/ahash-0.8.11/src/hash_set.rs:166:18
    |
    |
166 |         AHashSet(self.0.bitand(&rhs.0))
    |                  ^^^^^^ -------------- `self.0` moved due to this method call
    |                  |
    |                  move occurs because `self.0` has type `std::collections::HashSet<T, S>`, which does not implement the `Copy` trait
    |
note: `bitand` takes ownership of the receiver `self`, which moves `self.0`
    |
    |
161 |     fn bitand(self, rhs: Rhs) -> Self::Output;
    |               ^^^^
help: you could `clone` the value and consume it, if the `S: Clone` trait bound could be satisfied
    |
166 |         AHashSet(<std::collections::HashSet<T, S> as Clone>::clone(&self.0).bitand(&rhs.0))

error[E0507]: cannot move out of `self` which is behind a shared reference
   --> /cargo/registry/src/index.crates.io-6f17d22bba15001f/ahash-0.8.11/src/hash_set.rs:230:18
    |
    |
230 |         AHashSet(self.0.sub(&rhs.0))
    |                  ^^^^^^ ----------- `self.0` moved due to this method call
    |                  |
    |                  move occurs because `self.0` has type `std::collections::HashSet<T, S>`, which does not implement the `Copy` trait
    |
note: `sub` takes ownership of the receiver `self`, which moves `self.0`
    |
    |
200 |     fn sub(self, rhs: Rhs) -> Self::Output;
    |            ^^^^
help: you could `clone` the value and consume it, if the `S: Clone` trait bound could be satisfied
    |
230 |         AHashSet(<std::collections::HashSet<T, S> as Clone>::clone(&self.0).sub(&rhs.0))

Some errors have detailed explanations: E0308, E0507.
For more information about an error, try `rustc --explain E0308`.
error: could not compile `ahash` (lib) due to 4 previous errors

@dtolnay
Copy link
Member

dtolnay commented Aug 16, 2024

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Aug 16, 2024
@dtolnay
Copy link
Member

dtolnay commented Aug 16, 2024

Method syntax like .bitand(...) used to perform an autoref, and doesn't anymore with the new impls.

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=d18595251dafe2a1e7d74b56fb48300f

use std::collections::BTreeSet;
use std::ops::BitAnd;

fn f(a: BTreeSet<i32>, b: BTreeSet<i32>) {
    let a = (a,);
    let _ = a.0.bitand(&b);
    drop(a.0);
}
error[E0382]: use of moved value: `a.0`
   --> src/main.rs:7:10
    |
6   |     let _ = a.0.bitand(&b);
    |                 ---------- `a.0` moved due to this method call
7   |     drop(a.0);
    |          ^^^ value used here after move
    |
note: `bitand` takes ownership of the receiver `self`, which moves `a.0`
   --> /git/rust/library/core/src/ops/bit.rs:161:15
    |
161 |     fn bitand(self, rhs: Rhs) -> Self::Output;
    |               ^^^^
    = note: move occurs because `a.0` has type `BTreeSet<i32>`, which does not implement the `Copy` trait
help: you can `clone` the value and consume it, but this might not be your desired behavior
    |
6   |     let _ = a.0.clone().bitand(&b);
    |                ++++++++

@WaffleLapkin
Copy link
Member Author

Unfortunate that this happens, but ig there is nothing to do. Closing since this introduces a regression.

@WaffleLapkin WaffleLapkin deleted the set_owned_sub branch August 21, 2024 13:55
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-collections Area: `std::collection` disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. S-waiting-on-fcp Status: PR is in FCP and is awaiting for FCP to complete. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants