collections: Make BinaryHeap panic safe in sift_up / sift_down #25856

Merged
merged 1 commit into from
May 28, 2015

Conversation

bluss
Member

@bluss bluss commented May 28, 2015

collections: Make BinaryHeap panic safe in sift_up / sift_down

Use a struct called Hole that keeps track of an invalid location
in the vector and fills the hole on drop.

I include a run-pass test that the current BinaryHeap fails, and the new
one passes.

NOTE: The BinaryHeap will still be inconsistent after a comparison fails. It will
not have the heap property. What we fix is just that elements will be valid
values.
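
To make the idea concrete, here is a minimal sketch of a Hole-style guard, assuming a plain slice instead of the real BinaryHeap internals. The method names `element`, `get` and `move_to`, and the free function `sift_up`, are illustrative assumptions and not necessarily what this PR contains. The point is only that the removed element is written back in `Drop`, so every slot holds a valid value even if a comparison panics mid-loop.

```rust
use std::ptr;

/// A slice with one logically empty slot (the "hole") at `pos`.
/// The element that was read out of the hole lives in `elt`.
struct Hole<'a, T> {
    data: &'a mut [T],
    elt: Option<T>,
    pos: usize,
}

impl<'a, T> Hole<'a, T> {
    /// Create a hole at `pos` by reading the element out bitwise.
    fn new(data: &'a mut [T], pos: usize) -> Self {
        assert!(pos < data.len());
        // The slot still holds the old bits, but we now treat it as empty.
        let elt = unsafe { ptr::read(&data[pos]) };
        Hole { data, elt: Some(elt), pos }
    }

    fn pos(&self) -> usize {
        self.pos
    }

    /// The element that was removed when the hole was created.
    fn element(&self) -> &T {
        self.elt.as_ref().expect("hole element is present until drop")
    }

    /// Borrow an element that is not the hole.
    fn get(&self, index: usize) -> &T {
        assert!(index != self.pos);
        &self.data[index]
    }

    /// Move the element at `index` into the hole; the hole moves to `index`.
    fn move_to(&mut self, index: usize) {
        assert!(index != self.pos && index < self.data.len());
        unsafe {
            let base = self.data.as_mut_ptr();
            ptr::copy_nonoverlapping(base.add(index), base.add(self.pos), 1);
        }
        self.pos = index;
    }
}

impl<'a, T> Drop for Hole<'a, T> {
    /// Fill the hole again. This also runs during unwinding.
    fn drop(&mut self) {
        if let Some(elt) = self.elt.take() {
            let pos = self.pos;
            unsafe { ptr::write(&mut self.data[pos], elt) };
        }
    }
}

/// Sift the element at `pos` up towards `start` in a max-heap layout.
/// Returns the final position of the element.
fn sift_up<T: Ord>(data: &mut [T], start: usize, pos: usize) -> usize {
    let mut hole = Hole::new(data, pos);
    while hole.pos() > start {
        let parent = (hole.pos() - 1) / 2;
        // If this comparison panics, `hole` is dropped and the slot is refilled.
        if hole.element() <= hole.get(parent) {
            break;
        }
        hole.move_to(parent);
    }
    hole.pos()
}
```

Calling `sift_up(&mut data, 0, 3)` on `vec![5, 3, 4, 9]` moves the 9 to the root. As the note above says, a panic inside the loop would leave the slice without the heap property, but with no duplicated or missing elements.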

This is also a performance win: the new code does not write zeroed() values into the
holes, it simply leaves the old bits in place.

The net result is roughly a 5% decrease in runtime for BinaryHeap::from_vec. Unchecked
indexing improves this further (I confirmed it makes a difference, which is no surprise given
the non-sequential access pattern), but let's leave that for another PR.
Safety first 😉

Fixes #25842

@rust-highfive
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @brson (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@bluss
Member Author

bluss commented May 28, 2015

I prototyped this using a general scope guard, but this Hole representation turned out to be nice.

LLVM needs to be on our side here and see that the removed element's Option<T>
is always Some, etc. If you wonder why I use hole.pos() in one location and hole.pos
in another, it's because each choice benchmarked best in its own location. Optimizing
compilers are chaotic.

Edited: Moved useful info from this comment into the merge message itself (above)

cc @alexcrichton @gankro

@bluss bluss force-pushed the binary-heap-hole branch from 5f44f5b to b221853 Compare May 28, 2015 12:14
@bluss
Member Author

bluss commented May 28, 2015

Using only unchecked indexing in Hole removes the bench result dependence on hole.pos vs hole.pos() -- just shows how panicking can be an impediment to optimization.
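
As a rough illustration of checked versus unchecked indexing on a heap-style access pattern, here is a sketch that is not taken from this PR. Both function names are hypothetical. The unchecked variant validates the starting index once and then relies on the invariant that a parent index is always smaller than its child's, so no panic path remains inside the loop.

```rust
/// Walk from `pos` to the root, using ordinary (bounds-checked) indexing.
/// Every `data[..]` access carries a potential panic branch.
fn max_to_root_checked(data: &[u32], mut pos: usize) -> u32 {
    let mut best = data[pos];
    while pos > 0 {
        pos = (pos - 1) / 2;
        best = best.max(data[pos]);
    }
    best
}

/// Same walk, with a single up-front check and unchecked accesses after it.
fn max_to_root_unchecked(data: &[u32], mut pos: usize) -> u32 {
    assert!(pos < data.len());
    // SAFETY: `pos < data.len()` was just checked.
    let mut best = unsafe { *data.get_unchecked(pos) };
    while pos > 0 {
        pos = (pos - 1) / 2;
        // SAFETY: the parent index is strictly smaller than the child index,
        // which was in bounds, so it is in bounds as well.
        best = best.max(unsafe { *data.get_unchecked(pos) });
    }
    best
}
```

Whether the unchecked form is measurably faster depends on the compiler version and the surrounding code, so treat this as a shape to benchmark and not as a guaranteed win.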


impl<'a, T> Hole<'a, T> {
    /// Create a new Hole at index `pos`.
    pub fn new(data: &'a mut [T], pos: usize) -> Self {
Member

Could the pub be dropped from these functions? (Hole isn't exposed so they shouldn't need to be exposed either)

Contributor

It makes the code more portable, though. e.g. we can happily move this to a module without worrying.

Member Author

It came out with pub. I think pub is natural as soon as the methods are intended to be called from outside the struct itself. I know our privacy rules don't work that way, but it matters once the code grows and you move it out to a module. I'm fine either way, but I prefer pub.

Member

Idiomatic code in the rest of the standard library does not do this, so let's stick to existing conventions. We can add pub if necessary at a later date, but code should be as conservative as possible in exports today.

Member Author

ok that's fine

@alexcrichton
Member

Looks good to me, thanks @bluss! Perhaps the test could also be modified to ensure that the destructor for each element in the heap isn't run more than once?

Other than that though r=me, we can always tweak the performance at a later date.

cc @gankro

@alexcrichton alexcrichton added the beta-nominated Nominated for backporting to the compiler in the beta channel. label May 28, 2015
@bluss
Member Author

bluss commented May 28, 2015

Thanks. I'll add a test that makes sure no destructors were called right after catching the panic, before I inspect the data further. That should cover it (and fails on old version of BinaryHeap).
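
A test along these lines could look roughly like the following sketch. It is not the run-pass test from this PR. `Tracked`, its fields, and `push_with_panicking_comparison` are hypothetical names. The element type counts its own drops and panics in `cmp` once a shared flag is set. The drop count is read right after the panic is caught, before the heap is touched again.

```rust
use std::cell::Cell;
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use std::panic::{self, AssertUnwindSafe};
use std::rc::Rc;

/// Element that counts drops and panics on comparison once `armed` is set.
struct Tracked {
    value: u32,
    drops: Rc<Cell<usize>>,
    armed: Rc<Cell<bool>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

impl PartialEq for Tracked {
    fn eq(&self, other: &Self) -> bool {
        self.value == other.value
    }
}
impl Eq for Tracked {}

impl PartialOrd for Tracked {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Tracked {
    fn cmp(&self, other: &Self) -> Ordering {
        if self.armed.get() {
            panic!("comparison failed on purpose");
        }
        self.value.cmp(&other.value)
    }
}

/// Returns (panicked, drops right after the panic, heap length, total drops).
fn push_with_panicking_comparison() -> (bool, usize, usize, usize) {
    let drops = Rc::new(Cell::new(0));
    let armed = Rc::new(Cell::new(false));
    let make = |value| Tracked { value, drops: drops.clone(), armed: armed.clone() };

    let mut heap = BinaryHeap::new();
    for value in 0..5u32 {
        heap.push(make(value));
    }

    // From now on every comparison panics, so the next push unwinds in sift_up.
    armed.set(true);
    let result = panic::catch_unwind(AssertUnwindSafe(|| heap.push(make(9))));
    armed.set(false);

    // Inspect the drop counter before looking at the heap any further.
    let drops_after_panic = drops.get();
    let len_after_panic = heap.len();
    drop(heap);
    (result.is_err(), drops_after_panic, len_after_panic, drops.get())
}
```

With a panic-safe heap, no element is dropped during unwinding, and each of the six elements is dropped exactly once when the heap itself is dropped.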

ptr::write(&mut self.data[pos], x);
pos = parent;
while hole.pos() > start {
let parent = (hole.pos() - 1) >> 1;
Contributor

Surely LLVM can convert the much more semantically clear /2 to this?

Member Author

I'll fix it. Haven't really looked at changing anything besides what's needed, though.
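
For reference, the two spellings of the parent-index computation agree for every valid child position, since an unsigned right shift by one is the same as division by two. A small sketch, with hypothetical function names:

```rust
/// Parent index written with a shift, as in the quoted line.
fn parent_shift(pos: usize) -> usize {
    (pos - 1) >> 1
}

/// Parent index written with a division, as suggested in the review.
fn parent_div(pos: usize) -> usize {
    (pos - 1) / 2
}
```

Both require `pos >= 1`. The root at index 0 has no parent, and the subtraction would underflow there.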

@Gankra
Contributor

Gankra commented May 28, 2015

This looks great!

@bluss bluss force-pushed the binary-heap-hole branch from b221853 to c41a5cc Compare May 28, 2015 18:03
@bluss
Member Author

bluss commented May 28, 2015

PR updated. Everything is addressed, and pub is removed. The test now checks the number of drop calls (the old BinaryHeap dropped one element on panic), though it could track drops in even more detail.

@bluss bluss force-pushed the binary-heap-hole branch from c41a5cc to 5249cbb Compare May 28, 2015 18:24
@bluss
Member Author

bluss commented May 28, 2015

Wait, I thought I had spotted a bug and wondered how it could have slipped past my tests, but no, it's fine. Pushed an update that removes one scope I didn't need, so the diff for sift_down is easier to read.

@Gankra
Contributor

Gankra commented May 28, 2015

@bors r+

Sweet!

@bors
Contributor

bors commented May 28, 2015

📌 Commit 5249cbb has been approved by Gankro

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label May 28, 2015
@bors
Contributor

bors commented May 28, 2015

⌛ Testing commit 5249cbb with merge efebe45...

bors added a commit that referenced this pull request May 28, 2015
collections: Make BinaryHeap panic safe in sift_up / sift_down

@bors bors merged commit 5249cbb into rust-lang:master May 28, 2015
@bluss bluss deleted the binary-heap-hole branch May 28, 2015 22:24
@huonw
Member

huonw commented May 29, 2015

Nice work @bluss!

@alexcrichton
Member

triage: beta-accepted

@alexcrichton alexcrichton added the beta-accepted Accepted for backporting to the compiler in the beta channel. label Jun 9, 2015
@alexcrichton alexcrichton removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Jun 11, 2015
Labels
beta-accepted Accepted for backporting to the compiler in the beta channel. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

BinaryHeap is not exception safe
7 participants