events: remove emit micro-optimizations #16869

Closed
wants to merge 3 commits into from

apapirovski
Member

With improvements in V8, using separate emit functions is no longer necessary. This is not applicable to v8.x or earlier (the earlier version that's baking is though).

improvement                             confidence        p.value
events/ee-emit.js n=2000000                 2.98 %        0.09852489
events/ee-emit-2-args.js n=2000000          4.19 %    *** 0.0001914216
events/ee-emit-6-args.js n=2000000         61.69 %    *** 6.611964e-35
events/ee-emit-diff-args.js n=2000000      -0.36 %        0.305069
events/ee-once.js n=20000000                6.42 %    *** 1.27831e-06

ee-emit-diff-args.js was a custom benchmark to confirm whether any deoptimization happens when the call pattern (the number of arguments passed in) changes unpredictably between 0 and 3 arguments, i.e. the range that used to be the fast path. I also had a version of that benchmark with 0-4 arguments, and that one was +20% or so, due to the improvements in emit performance for more than 3 arguments.
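
For readers unfamiliar with the pattern being removed, here is a minimal sketch of arity-specialized dispatch next to the rest-parameter form. The helper names `emitSpecialized` and `emitSpread` are hypothetical, and this is not the code from `lib/events.js`; it only illustrates why 0 to 3 arguments used to be a "fast path".

```js
'use strict';

// Hypothetical, simplified "before": one branch per small argument count,
// falling back to a manual copy of `arguments` plus apply() for larger counts.
function emitSpecialized(handler, self) {
  const len = arguments.length - 2;
  switch (len) {
    case 0: handler.call(self); break;
    case 1: handler.call(self, arguments[2]); break;
    case 2: handler.call(self, arguments[2], arguments[3]); break;
    case 3: handler.call(self, arguments[2], arguments[3], arguments[4]); break;
    default: {
      const args = new Array(len);
      for (let i = 0; i < len; i++) args[i] = arguments[i + 2];
      handler.apply(self, args);
    }
  }
}

// Hypothetical "after": a rest parameter collects the arguments and a single
// call forwards them, whatever their count.
function emitSpread(handler, self, ...args) {
  handler.apply(self, args);
}

// Both helpers deliver the same arguments to the handler.
const seen = [];
function record(...args) { seen.push(args.length); }
for (let n = 0; n <= 5; n++) {
  const args = Array.from({ length: n }, (_, i) => i);
  emitSpecialized(record, null, ...args);
  emitSpread(record, null, ...args);
}
```

The sketch demonstrates equivalence only; the performance numbers in this PR come from the benchmark runs, not from code like this.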

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

benchmark, events

With improvements in V8, using separate emit functions is no longer
necessary and can instead be replaced by the spread operator.

@apapirovski apapirovski added dont-land-on-v4.x events Issues and PRs related to the events subsystem / EventEmitter. performance Issues and PRs related to the performance of Node.js. labels Nov 7, 2017
@nodejs-github-bot nodejs-github-bot added the events Issues and PRs related to the events subsystem / EventEmitter. label Nov 7, 2017
Member

@addaleax addaleax left a comment

nice! 👏

@apapirovski
Member Author

apapirovski commented Nov 7, 2017

Benchmark CI: https://ci.nodejs.org/job/benchmark-node-micro-benchmarks/23/

Results

 events/ee-add-remove.js n=250000                        -1.25 %            5.003391e-02
 events/ee-emit-2-args.js n=2000000                       3.25 %        *** 3.664056e-13
 events/ee-emit-4-args.js n=2000000                      45.91 %        *** 1.096930e-59
 events/ee-emit.js n=2000000                              1.66 %        *** 8.587999e-04
 events/ee-listener-count-on-prototype.js n=50000000     -0.14 %            9.623456e-01
 events/ee-listeners.js n=5000000                        -0.93 %            1.751751e-01
 events/ee-listeners-many.js n=5000000                    0.10 %            9.296697e-01
 events/ee-once.js n=20000000                             4.57 %         ** 2.418052e-03

@mscdex
Contributor

mscdex commented Nov 7, 2017

We should just have one ee-emit.js benchmark which has a parameter for number of args, something like 0-6 maybe.

@mscdex
Contributor

mscdex commented Nov 7, 2017

Also, we should probably make the number of listeners configurable as well, so that we can make sure there are no regressions for smaller numbers of listeners (e.g. 2-9).
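
A single parameterized benchmark along the lines suggested above could look roughly like this. It is a self-contained sketch, not the harness used in Node's `benchmark/` directory: `runCase`, the event name `'dummy'`, and the parameter values are all assumptions chosen for illustration.

```js
'use strict';
const EventEmitter = require('events');

// Runs `n` emits with `listeners` handlers and `argc` arguments.
// Returns the elapsed time in nanoseconds and the number of handler calls.
function runCase({ listeners, argc, n }) {
  const ee = new EventEmitter();
  let calls = 0;
  for (let k = 0; k < listeners; k++)
    ee.on('dummy', () => { calls++; });

  const args = Array.from({ length: argc }, (_, i) => i);
  const start = process.hrtime.bigint();
  for (let i = 0; i < n; i++)
    ee.emit('dummy', ...args);
  const elapsedNs = process.hrtime.bigint() - start;
  return { listeners, argc, n, calls, elapsedNs };
}

// Hypothetical parameter matrix, loosely following the ranges suggested above.
const results = [];
for (const listeners of [1, 2, 5, 10])
  for (const argc of [0, 2, 4, 6])
    results.push(runCase({ listeners, argc, n: 1e4 }));
```

A real measurement would need a much larger `n`, repeated runs, and a statistical comparison between builds; this only shows the shape of the listeners-by-argc matrix.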

@apapirovski
Member Author

Is the AIX failure related? Anyone know? Doesn't seem like it but...

@lpinca
Member

lpinca commented Nov 7, 2017

It is failing since yesterday, not related.


const ee = new EventEmitter();

for (var k = 0; k < 10; k += 1)
Contributor

@mathiasbynens mathiasbynens Nov 7, 2017

Coding style question: why not let? (Not just here but throughout the patch)

Member

It's much slower because it creates a closure for the variable in each loop iteration.

Member Author

@mathiasbynens We have a lint rule re: this so that's the main reason. I think @TimothyGu tried to change it recently and feedback from @bmeurer was that we shouldn't quite yet.

Member Author

Here's the relevant PR with more conversation: #15648
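
The language-level difference behind the `var` vs `let` question in this thread can be sketched as follows. It shows only the binding semantics; whether the per-iteration binding has any measurable cost depends on the engine version and is not demonstrated here.

```js
'use strict';

// With `var`, one binding is shared by every iteration, so all closures
// observe the final value of the counter.
const viaVar = [];
for (var i = 0; i < 3; i++)
  viaVar.push(() => i);

// With `let`, each iteration gets its own binding, so each closure keeps
// the value from the iteration that created it.
const viaLet = [];
for (let j = 0; j < 3; j++)
  viaLet.push(() => j);

const varValues = viaVar.map((f) => f()); // [3, 3, 3]
const letValues = viaLet.map((f) => f()); // [0, 1, 2]
```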

handler.apply(this, args);
} else {
const len = handler.length;
const listeners = arrayClone(handler, len);
Member

@bmeurer bmeurer Nov 8, 2017

Can you leave a TODO here to consider switching to Array.prototype.slice once V8 6.4 lands in Node? Or even to evaluate the idea of avoiding the defensive copy on emit and rather making sure that the handler itself is never mutated?
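
As a rough illustration of the swap mentioned in this comment, the sketch below compares a loop-based clone with `Array.prototype.slice`. The body of `arrayClone` is an assumption: the helper is only visible in this thread through the call `arrayClone(handler, len)`.

```js
'use strict';

// Assumed shape of a loop-based clone helper (not copied from lib/events.js).
function arrayClone(arr, n) {
  const copy = new Array(n);
  for (let i = 0; i < n; ++i)
    copy[i] = arr[i];
  return copy;
}

const handlers = [() => 'a', () => 'b', () => 'c'];
const viaLoop = arrayClone(handlers, handlers.length);
const viaSlice = handlers.slice();

// Both produce a new array holding the same function references.
const sameContents =
  viaLoop.length === viaSlice.length &&
  viaLoop.every((fn, i) => fn === viaSlice[i]);
```

The two are functionally interchangeable for dense arrays; which one is faster is an engine-version question that only a benchmark run can answer.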

Member

@bmeurer bmeurer left a comment

LGTM modulo comment.

Seeing this happening makes me happy!

Contributor

@mscdex mscdex left a comment

Making my concerns more explicit ...

We should just have one ee-emit.js benchmark which has a parameter for number of args, something like 0-6 maybe.

Also, we should probably make the number of listeners configurable as well, so that we can make sure there are no regressions for smaller numbers of listeners (e.g. 2-9).

@apapirovski
Member Author

apapirovski commented Nov 8, 2017

New benchmark CI for the changes: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/28/

 events/ee-add-remove.js n=250000                        -0.51 %            3.833557e-01
 events/ee-emit-2-args.js n=2000000                       3.40 %         ** 3.686601e-03
 events/ee-emit-4-args.js n=2000000                      45.03 %        *** 3.944833e-67
 events/ee-emit.js listeners=10 argc=0 n=2000000         -4.79 %        *** 1.212118e-18
 events/ee-emit.js listeners=10 argc=10 n=2000000        13.82 %        *** 3.714492e-15
 events/ee-emit.js listeners=10 argc=2 n=2000000        -15.31 %        *** 8.885326e-36
 events/ee-emit.js listeners=10 argc=4 n=2000000         14.30 %        *** 6.484010e-11
 events/ee-emit.js listeners=1 argc=0 n=2000000          -0.04 %            9.825977e-01
 events/ee-emit.js listeners=1 argc=10 n=2000000         21.72 %        *** 5.891698e-07
 events/ee-emit.js listeners=1 argc=2 n=2000000          -0.75 %            4.022986e-01
 events/ee-emit.js listeners=1 argc=4 n=2000000          26.56 %        *** 8.406902e-29
 events/ee-emit.js listeners=5 argc=0 n=2000000          -3.05 %        *** 2.006354e-06
 events/ee-emit.js listeners=5 argc=10 n=2000000         32.30 %        *** 1.157809e-12
 events/ee-emit.js listeners=5 argc=2 n=2000000          -5.86 %        *** 2.034290e-22
 events/ee-emit.js listeners=5 argc=4 n=2000000          22.25 %        *** 5.349469e-39
 events/ee-listener-count-on-prototype.js n=50000000     -0.27 %            8.455537e-01
 events/ee-listeners.js n=5000000                         0.81 %            1.344333e-01
 events/ee-listeners-many.js n=5000000                    3.61 %          * 4.947461e-02
 events/ee-once.js n=20000000                             6.30 %        *** 4.165866e-05

I have absolutely no clue why there's exactly one benchmark that tanked on the new version:

 events/ee-emit.js listeners=10 argc=2 n=2000000        -15.31 %        *** 8.885326e-36

As far as I can tell this is almost equivalent to the old ee-emit-multi-args.js.

@apapirovski
Member Author

Ok, that benchmark needs to be tweaked. There's an optimization happening when using apply or bind that doesn't happen when one just makes a regular emit call. Working on v2.

@apapirovski
Member Author

apapirovski commented Nov 8, 2017

Ok, created a second version of the benchmark that doesn't use apply or bind (I assume that using those let V8 optimize for the exact number of arguments being passed which favoured the previous version of the code). The new benchmark just plainly calls emit like before.

Benchmark CI: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/31/

(The reason I changed it is that using apply or bind with emit is not something we do in our code. It doesn't seem worth it to test for a scenario that doesn't exist.)
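
The call styles contrasted above can be sketched as follows. The event name `'dummy'` and the argument values are assumptions; the sketch shows only that the three forms are functionally equivalent, not how any engine optimizes them.

```js
'use strict';
const EventEmitter = require('events');

const ee = new EventEmitter();
const seen = [];
ee.on('dummy', (...args) => seen.push(args.length));

const args = [1, 2];

// Style 1: apply() with a prebuilt array (event name first).
ee.emit.apply(ee, ['dummy', ...args]);

// Style 2: a bound function that already carries the event name and arguments.
const bound = ee.emit.bind(ee, 'dummy', ...args);
bound();

// Style 3: a plain call with the arguments written out at the call site.
ee.emit('dummy', 1, 2);
```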

@apapirovski apapirovski force-pushed the patch-events-emit-perf branch from 210602d to 7ddf389 Compare November 8, 2017 18:45
@apapirovski
Member Author

New benchmark results:

 events/ee-emit.js listeners=10 argc=0 n=2000000      -3.54 %        *** 6.874300e-06
 events/ee-emit.js listeners=10 argc=10 n=2000000     57.01 %        *** 5.922333e-68
 events/ee-emit.js listeners=10 argc=2 n=2000000       1.52 %            8.892891e-02
 events/ee-emit.js listeners=10 argc=4 n=2000000      45.02 %        *** 1.893663e-72
 events/ee-emit.js listeners=1 argc=0 n=2000000       -0.18 %            9.229945e-01
 events/ee-emit.js listeners=1 argc=10 n=2000000      51.86 %        *** 4.830343e-21
 events/ee-emit.js listeners=1 argc=2 n=2000000        4.25 %        *** 2.762911e-06
 events/ee-emit.js listeners=1 argc=4 n=2000000       41.83 %        *** 2.613486e-39
 events/ee-emit.js listeners=5 argc=0 n=2000000       -0.72 %            2.251744e-01
 events/ee-emit.js listeners=5 argc=10 n=2000000      59.05 %        *** 6.230044e-24
 events/ee-emit.js listeners=5 argc=2 n=2000000        2.09 %         ** 1.569820e-03
 events/ee-emit.js listeners=5 argc=4 n=2000000       34.62 %        *** 1.561699e-50

@apapirovski
Member Author

@mscdex PTAL

@apapirovski
Member Author

apapirovski commented Nov 13, 2017

@mscdex or anyone else that reviewed this, PTAL. The "changes requested" needs to be dismissed before we can land this.

Member

@jasnell jasnell left a comment

Still LGTM

@mscdex
Contributor

mscdex commented Nov 13, 2017

We can make things faster for multiple listeners by adding some fast paths in arrayClone(). For example, this version of arrayClone() gives noticeable speed improvements for all array sizes:

function arrayClone(arr, n) {
  switch (n) {
    case 0: return [];
    case 1: return [arr[0]];
    case 2: return [arr[0], arr[1]];
    case 3: return [arr[0], arr[1], arr[2]];
    case 4: return [arr[0], arr[1], arr[2], arr[3]];
    case 5: return [arr[0], arr[1], arr[2], arr[3], arr[4]];
  }
  // Not included in 'default' case because of perf issue with `const` in
  // a switch case
  const copy = new Array(n);
  copy[0] = arr[0];
  copy[1] = arr[1];
  copy[2] = arr[2];
  copy[3] = arr[3];
  copy[4] = arr[4];
  copy[5] = arr[5];
  for (var i = 6; i < n; ++i)
    copy[i] = arr[i];
  return copy;
}

I chose 5 as the max array length for fast paths as that seems like a reasonable limit to me.

Also, 0 and 1-length arrays are supported for backwards compatibility, otherwise we could remove those cases since those kinds of arrays will not be generated internally (I'm thinking about modules that may directly mutate arrays in _events).

@TimothyGu
Member

TimothyGu commented Nov 13, 2017

@mscdex Isn't that a bit outside the scope of this PR specifically, considering this PR does not touch that piece of code?

@mscdex
Contributor

mscdex commented Nov 13, 2017

@TimothyGu No? To me this PR is about improving emit() performance (judging by the previously posted benchmark results). I just thought incorporating my suggestion would make those improvements more substantial since emit() is the sole user of arrayClone()...

@Trott
Member

Trott commented Nov 14, 2017

> To me this PR is about improving emit() performance

@mscdex I guess @apapirovski can say for sure. Personally, I thought it was about making the code more maintainable/understandable by removing CrankshaftScript optimizations that no longer provide significant benefit. The fact that it also improves (rather than merely preserves) performance may have been incidental.

I also prefer PRs to remain narrowly scoped on principle. If we put the arrayClone() suggestion into a separate PR, it means we can land it in older versions if it helps there too without having to land the stuff in this PR at the same time. Or if it causes unexpected problems, it can be backed out without having to back out these other changes. (I guess that's an argument for separate commits more than separate PRs. Still, putting it in a separate PR makes the code review more manageable, as well as following the conversation relevant to the code change.)

@mscdex
Contributor

mscdex commented Nov 14, 2017

> If we put the arrayClone() suggestion into a separate PR, it means we can land it in older versions if it helps there too without having to land the stuff in this PR at the same time

From what I've seen, benchmarks are often not checked when backporting, which is especially important when going from TurboFan to Crankshaft.

Whatever though, I just thought I'd throw the perf improvement out there for anyone interested.

@jasnell
Member

jasnell commented Nov 14, 2017

I see no reason not to add @mscdex's suggestion as a second separate commit in this PR.

@bmeurer
Member

bmeurer commented Nov 14, 2017

@mscdex These arrayClone optimizations don't look really appealing to me. How about avoiding the defensive copy upon emission completely and instead copy on mutation? Emission seems to be the more common case.

@mscdex
Contributor

mscdex commented Nov 14, 2017

@bmeurer I'm not sure what you're asking, but we've always made a copy up front in case the event handlers change during the execution of event handlers.
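
The behavior that the up-front copy protects can be observed with the public `EventEmitter` API. In this small sketch the event name `'ping'` and the listener names are arbitrary:

```js
'use strict';
const EventEmitter = require('events');

const ee = new EventEmitter();
const calls = [];

function second() { calls.push('second'); }
function first() {
  calls.push('first');
  // Remove a later listener while the emit is still in progress.
  ee.removeListener('ping', second);
}

ee.on('ping', first);
ee.on('ping', second);

ee.emit('ping'); // both run: the in-progress emit iterates its own snapshot
ee.emit('ping'); // only `first` runs: `second` is gone for later emits
```

Listeners attached at the time of the emit are all called, even if one of them is removed partway through; the removal takes effect from the next emit onward.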

@bmeurer
Member

bmeurer commented Nov 14, 2017

Right, but wouldn't it be possible to treat the handlers array as copy on write? Such that when iterating over them, you sort of take ownership of the handlers array and in case someone adds/removes handlers, you create a new Array instead of mutating the existing one.
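
One way to picture the copy-on-write idea is the minimal sketch below. `CowEmitter` is hypothetical and far simpler than Node's `EventEmitter`; it also copies on every add or remove rather than only while an emit is in progress, which is a simplification of what is proposed above.

```js
'use strict';

// Hypothetical minimal emitter: the listener array is never mutated in place.
// on()/off() build a new array, so emit() can iterate the current one
// without copying it first.
class CowEmitter {
  constructor() {
    this.handlers = new Map();
  }

  on(type, fn) {
    const list = this.handlers.get(type) || [];
    this.handlers.set(type, [...list, fn]); // new array on every add
    return this;
  }

  off(type, fn) {
    const list = this.handlers.get(type);
    if (list === undefined) return this;
    const index = list.lastIndexOf(fn);
    if (index !== -1)
      this.handlers.set(type, list.filter((_, i) => i !== index));
    return this;
  }

  emit(type, ...args) {
    const list = this.handlers.get(type);
    if (list === undefined || list.length === 0) return false;
    // `list` is a stable snapshot: on()/off() replace the Map entry
    // instead of touching this array.
    for (const fn of list)
      fn.apply(this, args);
    return true;
  }
}

const ee = new CowEmitter();
const calls = [];
function second() { calls.push('second'); }
function first() { calls.push('first'); ee.off('ping', second); }
ee.on('ping', first).on('ping', second);
ee.emit('ping');
ee.emit('ping');
```

This moves the allocation from the emit path to the add/remove path. It holds only if every mutation goes through `on()` and `off()`; code that edits the stored array directly would bypass the guarantee.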

@TimothyGu
Member

@bmeurer How would you get notified when the array changes?

@Trott
Member

Trott commented Nov 14, 2017

I could be wrong, but it sure looks like we may be spiraling off into a conversation about whether and how to implement the arrayClone() part. So I'm going to repeat my suggestion that it be a separate PR so that conversation can happen separately and not stall the pretty-sure-we-all-agree-it's-a-good-thing-and-should-land changes that @apapirovski has proposed here already.

@Trott
Member

Trott commented Nov 14, 2017

By the way, the only thing preventing this from landing at this point is the objection from @mscdex. So I guess the question is whether the objection has been effectively cleared by the benchmark fixes/changes that @apapirovski did in response? Or is there still an objection to landing this without the arrayClone() stuff as well?

@apapirovski
Member Author

apapirovski commented Nov 14, 2017

Sorry, I haven't had time the past couple of days to address much of what's been discussed here. But as mentioned, the intent was mostly to remove "ugly" code that wasn't improving performance any longer. The speed-up for emits with many arguments is just a nice boon. (Further evidenced by the quote in the original: "With improvements in V8, using separate emit functions is no longer necessary.")

Re: the ensuing discussion, I think @bmeurer has a valid point re: allocating new handler array when an event is attached as opposed to emitted. That's something that could be benchmarked and tested. I might look at it if I have time this week.

In the meantime, I'll be landing this shortly.

@apapirovski
Member Author

Landed in f44f18a

@apapirovski apapirovski deleted the patch-events-emit-perf branch November 14, 2017 19:04
apapirovski added a commit that referenced this pull request Nov 14, 2017
With improvements in V8, using separate emit functions is no longer
necessary and can instead be replaced by the spread operator.

improvement                             confidence        p.value
events/ee-emit.js n=2000000                 2.98 %        0.09852489
events/ee-emit-2-args.js n=2000000          4.19 %    *** 0.0001914216
events/ee-emit-6-args.js n=2000000         61.69 %    *** 6.611964e-35
events/ee-emit-diff-args.js n=2000000      -0.36 %        0.305069
events/ee-once.js n=20000000                6.42 %    *** 1.27831e-06

PR-URL: #16869
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: Refael Ackermann <[email protected]>
Reviewed-By: Evan Lucas <[email protected]>
Reviewed-By: Bryan English <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Timothy Gu <[email protected]>
Reviewed-By: Franziska Hinkelmann <[email protected]>
Reviewed-By: Benedikt Meurer <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Brian White <[email protected]>
evanlucas pushed a commit that referenced this pull request Nov 14, 2017
Labels
events Issues and PRs related to the events subsystem / EventEmitter. performance Issues and PRs related to the performance of Node.js.