Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing Issue 152 #163

Merged
merged 10 commits into from
Nov 22, 2016
Merged

Conversation

ChristopherDavenport
Copy link
Contributor

@ChristopherDavenport ChristopherDavenport commented Nov 19, 2016

Working as group which is why the pull request is being initiated early so that we can each merge into this common branch.

First Commit Implements cmp
Uses the existing zip, map, and reduce combinators to implement cmp.

Closes #152

@nikomatsakis
Copy link
Member

Looks good so far: one thing that's missing is tests =)

@ChristopherDavenport
Copy link
Contributor Author

This is a basic implementation where performance can be improved through early returns, however this should effectively satisfy the contracts.

Copy link
Member

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! It's close but I think there are a few things to fix; I left some comments.

.map(|(x, y)| Ord::cmp(&x, &y))
.reduce(|| Ordering::Equal,
|cmp_a, cmp_b| if cmp_a != Ordering::Equal {
cmp_b
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is backwards. You want if cmp_a == Ordering::Equal { cmp_ b } else { cmp_a }.

self.zip(other.into_par_iter())
.map(|(x, y)| PartialOrd::partial_cmp(&x, &y))
.reduce(|| Some(Ordering::Equal),
|cmp_a, cmp_b| match (cmp_a, cmp_b) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also not quite right. I think you want something like this:

match cmp_a {
    Some(Ordering::Equal) => cmp_b,
    Some(_) => cmp_a,
    None => None,
}

The idea is that once we find something which is ordered (and not equal), it doesn't matter what comes after. But if we find something unordered, same is true.

Here is an example demonstrating this: http://play.integer32.com/?gist=87daf4489fb2ca744e1ea99abac0f443&version=stable


#[test]
pub fn check_eq() {
let a: Vec<usize> = (0..1024).collect();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are actually not testing your code. You are building vectors and then comparing them sequentially. You need something like:

let a = (0..1024).into_par_iter(); // NB: not collect() :)
let b = (0..1024).into_par_iter();
let result = a.eq(b);
assert!(result);

However, my preference is to write the tests so that they compare par against seq iterators, rather than directly encoding the expected result:

assert_eq!((0..1024).into_par_iter().eq(0..1024), (0..1024).eq(0..1024));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add tests both between pars and between par and seq iterators. That way you'll have tests for expected behavior internally as well as expected behavior as compared to the external standard of seq iterators.


assert!(result_gt);
assert!(result_eq);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add some tests involving floats (which are not fully ordered)?

http://play.integer32.com/?gist=87daf4489fb2ca744e1ea99abac0f443&version=stable would be a good one, if we added the parallel version and compared them against one another.

It'd also be good to add a test with some randomly generated data, using a fixed seed.

Something like this:

#[test]
pub fn cmp_and_partial_cmp() {
    use rand::{Rng, SeedableRng, XorShiftRng};

    let mut rng = XorShiftRng::from_seed([14159, 26535, 89793, 23846]);
    let a: Vec<i32> = rng.gen_iter().take(1024).collect();
    let b: Vec<i32> = rng.gen_iter().take(1024).collect();
    for i in 0..a.len() {
        assert_eq!(a[i..].par_iter().partial_cmp(b[i..]));
        assert_eq!(a[i..].iter().cmp(b[i..]);
    }
}

(a twist on @utkarshkukreti's test)

@ChristopherDavenport
Copy link
Contributor Author

I'll get on it right away. I know the rest of the crew that was working on this with me went onto vacation, so I'll see what I can get under control.

As an aside, thank you for the really useful feedback. I'm fairly new to rust and such constructive criticism is very rare and very appreciated as I learned a lot from your change suggestions.

@ChristopherDavenport
Copy link
Contributor Author

  1. Fixed cmp, it was indeed backwards.
  2. For partial_cmp the much simpler match semantics play out much easier and return correctly if an earlier value is greater.
  3. All tests have been switched to into_par_iter or par_iter where appropriate. Tests check for appropriate evaluation as set by this library, and with comparison to seq iterators, therefore if seq iterator implementation in some way changes this will be apparent, as local tests will pass but comparison tests will fail.
  4. Float vectors were analyzed like in the simple example suggested pulling both with an early NaN returning None, and after a difference not making a difference to the return. Randomly generated data using a fixed seed was implemented for both cmp and partial_cmp with a comparison to the sequential version showing that the comparisons are indeed equal.

If there is anything more I need to do just let me know.

Copy link
Member

@nikomatsakis nikomatsakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

let a: Vec<i32> = rng.gen_iter().take(1024).collect();
let b: Vec<i32> = rng.gen_iter().take(1024).collect();
for i in 0..a.len() {
let par_result = a[i..].par_iter().partial_cmp(b[i..].par_iter());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, the second call to par_iter (and iter) is not necessary: a[i..].par_iter().partial_cmp(&b[i..]) is equivalent. But this is also fine, no need to change it -- more explicit this way.

@nikomatsakis
Copy link
Member

I see you have some merge conflicts -- do you think you can rebase? I suspect part of the problem is that you seem to have run rustfmt, which caused a lot of miscellaneous unrelated diffs in the code. (Though I don't mind, since I've been wanting to run rustfmt regularly).

One thing you might try if you are going to rebase is to introduce a first commit where you run rustfmt -- before you make any of our changes. That will isolate the rustfmt diffs into one commit and also make merging/rebasing easier.

Let me know though: if you are not that familiar with git, or don't have time, I can also help out and merge the changes.

@ChristopherDavenport
Copy link
Contributor Author

I will run rustfmt on these files which has no issue and then issue a rebase of an intermediate commit.

I can do one across the whole project but it seems to create an oddity in the implementation of Tree in scope/test.rs. So rather than include it as the baseline after I finish the changes I can issue a pr with the whole project run through the rustfmt blender, that way the commits have good separation of concerns.

ChristopherDavenport and others added 10 commits November 21, 2016 13:34
Uses the existing zip, map, and reduce combinators to implement cmp.
Uses the existing zip, map, and reduce combinators to implement partial_cmp.
Ran cmp on usize for equal, lessThan and greaterThan
partial_cmp for equal, gt, lt, and NAN
Added functions eq, ne, lt, le, gt, ge
Tests for eq,ne,lt,le,gt,ge
cmp had an inversed if statement that neeeded to be rectified.

Tests were modified to actually use to par_iter cmp both directly between two par_iters
and between a par_iter and a sequential
Added better implementation of partial_cmp which returns correctly on first difference.

Added direct and to_seq comparison tests of the new implementation including those for par iterators with NAN.
Generates a random, but set via a known seed Vector which is used to compare cmp, and partial_cmp of both par and seq and ensure the results are the same for each step of the the comparison.
@nikomatsakis
Copy link
Member

Thanks!

@ChristopherDavenport ChristopherDavenport deleted the issue-152 branch May 1, 2017 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants