Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort is incredibly slow #477

Closed
blackxored opened this issue Jan 19, 2018 · 9 comments
Closed

Sort is incredibly slow #477

blackxored opened this issue Jan 19, 2018 · 9 comments

Comments

@blackxored
Copy link

screenshot 2018-01-19 23 07 49

@davidchambers
Copy link
Member

Have you tried with type checking disabled?

> S.create({checkTypes: false, env: []}).sort([1, 4, 10, 8, 9, 10])
[1, 4, 8, 9, 10, 10]

I imagine this will make a big difference.

@CrossEye
Copy link

CrossEye commented Jan 20, 2018

Note that this comparison is not accurate. R.sort is not used that way. (The fact that those numbers show it twice as fast as the native sort, which doesn't have to reallocate anything, should be a giveaway.)


I was going to leave it at that, but I wanted to make sure the numbers seemed more reasonable. My tests show that Sanctuary is extremely slow, even without type-checking:

const R = require('ramda');
const S = require('sanctuary');

const COUNT = 1e4

const time = (msg, fn) => {
  let arr = [1, 4, 10, 8, 9, 10]
  const now = new Date();
  for (let i = 0; i < COUNT; i++) {
    fn(arr)
  }
  console.log(`${msg}: ${(new Date() - now) / COUNT}`)
}

time('Native', arr => arr.sort())
time('Ramda', arr => R.sort((a, b) => a - b, arr))
time('Sanctuary', arr => S.sort(arr))

const S2 = S.create({checkTypes: false, env: []})
time('Sanctuary w/o type-checking', arr => S2.sort(arr))
Native: 0.0007
Ramda: 0.0009
Sanctuary: 0.5779
Sanctuary w/o type-checking: 0.1062
  - Ubuntu 17.04, 16GB, Intel Core i7-4710MQ @ 2.50GHz 

Running it with 1e5 iterations is painful; I wouldn't even try 1e6.

Of course, Ramda is not trying to sort an arbitrary foldable monoid over an arbitrary ordered type. Nor is the native version. They content themselves with plain arrays. Sanctuary does much more. But this performance is scary: a tenth of a millisecond to sort a 6-element list, half of a millisecond with type-checking turned on.

Edit:

That timing function is wrong, but the results are similar if you change it to:

const time = (msg, fn) => {
  const now = new Date();
  for (let i = 0; i < COUNT; i++) {
    fn([1, 4, 10, 8, 9, 10])
  }
  console.log(`${msg}: ${(new Date() - now) / COUNT}`)
}

(obviously, in the native case, we shouldn't be re-sorting the resulting sorted array every time.)

@davidchambers
Copy link
Member

Thoughts on how to improve the performance of Z.sortBy would be much appreciated. @Avaq?

@Avaq
Copy link
Member

Avaq commented Jan 22, 2018

We can definitely look into improving its performance. One thing that comes to mind is implementing an array-specialised path. Let's get benchmarks sorted first though:

@paldepind
Copy link

paldepind commented Mar 11, 2018

The current algorithm is O(n^2) so that's probably why it's slow. I'm guessing performance could be very drastically improved by implementing it like:

  1. Convert foldable into array
  2. Sort array using Array#sort with a comparison function that uses < for primitives and ord for FL Ords
  3. Convert array back into foldable

If the argument to sort is already an array the last step could be skipped.

That is O(n*log(n)) and should be much faster. But, it will not be stable since Array#sort is not stable.

@paldepind
Copy link

The above could also be used to implement a stable sort. Then one would have to turn the Foldable into an array where each element is tagged with its original index. The comparison function should then break ties by using the index. That would probably make it a bit slower though.

@davidchambers
Copy link
Member

Are you interested in submitting a pull request, Simon?

@paldepind
Copy link

I'd love to 😄 Are there benchmarks I can run to compare the performance before and after?

I've implemented the above approach for List. The implementation can be seen here.

I've created a benchmark that measures the overall performance and roughly measures the cost of doing a stable sort vs. doing an unstable sort. With stability, the algorithm is only about 50% slower than Array#sort. Stability makes the sorting about 15% slower compared to the unstable version.

image

I personally think doing a stable sort is worth it. The performance difference isn't that huge and a stable sorts goes well with functional programming.

To get a bit extra performance I pull off a trick: I check the first element of the list. If it is a number or a string I do an unstable sort since it's impossible to tell if two identical numbers or strings gets exchanged. I'm not sure if there's any downsides to that.

@davidchambers
Copy link
Member

I'd love to

Wonderful!

Are there benchmarks I can run to compare the performance before and after?

https://github.com/sanctuary-js/sanctuary-type-classes/blob/master/bench/README.md

I'm sure @Avaq will be happy to answer questions if it's not clear how to run the benchmarks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants