
process: faster next tick #18617

Closed
wants to merge 2 commits

Conversation

@mafintosh (Member) commented Feb 7, 2018

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines
Affected core subsystem(s)

process

Moves the nextTick implementation to a series of fixed-size circular arrays instead of the linked list. The size of each array is fixed at 2048, which seems to provide the best perf.

Since streams are heavy nextTick users these days, I also added a pipe benchmark to see the impact.
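
For context, a minimal sketch of the general idea described above, assuming a buffer size of 2048 and illustrative names (FixedCircularBuffer, kSize); it is not the PR's exact code. Each buffer is a fixed-size circular array, and when one fills up a fresh buffer is chained on instead of resizing, so drained buffers can be dropped and their memory reclaimed:

const kSize = 2048; // power of two, so indices can wrap with a bitmask

class FixedCircularBuffer {
  constructor() {
    this.list = new Array(kSize);
    this.top = 0;    // next write index
    this.bottom = 0; // next read index
    this.next = null;
  }
  isEmpty() {
    return this.top === this.bottom;
  }
  isFull() {
    // One slot is kept free to distinguish "full" from "empty".
    return ((this.top + 1) & (kSize - 1)) === this.bottom;
  }
  push(data) {
    this.list[this.top] = data;
    this.top = (this.top + 1) & (kSize - 1);
  }
  shift() {
    const item = this.list[this.bottom];
    if (item === undefined)
      return null;
    this.list[this.bottom] = undefined;
    this.bottom = (this.bottom + 1) & (kSize - 1);
    return item;
  }
}

class FixedQueue {
  constructor() {
    this.head = this.tail = new FixedCircularBuffer();
  }
  push(data) {
    if (this.head.isFull()) {
      // Chain a fresh buffer instead of growing the existing one.
      this.head = this.head.next = new FixedCircularBuffer();
    }
    this.head.push(data);
  }
  shift() {
    const tail = this.tail;
    const next = tail.shift();
    if (tail.isEmpty() && tail.next !== null) {
      // Drop the drained buffer so its memory can be reclaimed.
      this.tail = tail.next;
    }
    return next;
  }
}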

NextTick benchmark
node benchmark/compare.js --old ./node-master --new ./out/Release/node --filter next-tick process > next-tick.csv
cat next-tick.csv | Rscript benchmark/compare.R
                                              confidence improvement accuracy (*)   (**)  (***)
 process/next-tick-breadth-args.js millions=4        ***     40.11 %       ±1.23% ±1.64% ±2.14%
 process/next-tick-breadth.js millions=4             ***      7.16 %       ±3.50% ±4.67% ±6.11%
 process/next-tick-depth-args.js millions=12         ***      5.46 %       ±0.91% ±1.22% ±1.59%
 process/next-tick-depth.js millions=12              ***     23.26 %       ±2.51% ±3.36% ±4.40%
 process/next-tick-exec-args.js millions=5           ***     38.64 %       ±1.16% ±1.55% ±2.01%
 process/next-tick-exec.js millions=5                ***     77.20 %       ±1.63% ±2.18% ±2.88%

Be aware that when doing many comparisons the risk of a false-positive
result increases. In this case there are 6 comparisons, so you can
expect the following number of false-positive results:
  0.30 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.06 false positives, when considering a   1% risk acceptance (**, ***),
  0.01 false positives, when considering a 0.1% risk acceptance (***)
Streams benchmark
node benchmark/compare.js --old ./node-master --new ./out/Release/node stream > stream.csv
cat stream.csv | Rscript benchmark/compare.R
                                                       confidence improvement accuracy (*)   (**)  (***)
 streams/pipe-object-mode.js n=5000000                        ***     21.85 %       ±0.76% ±1.01% ±1.32%
 streams/pipe.js n=5000000                                    ***     20.30 %       ±0.59% ±0.79% ±1.02%
 streams/readable-bigread.js n=1000                                    0.00 %       ±0.67% ±0.89% ±1.16%
 streams/readable-bigunevenread.js n=1000                             -0.10 %       ±0.69% ±0.92% ±1.20%
 streams/readable-boundaryread.js type="buffer" n=2000                -0.71 %       ±0.87% ±1.15% ±1.50%
 streams/readable-boundaryread.js type="string" n=2000                 0.04 %       ±0.91% ±1.21% ±1.58%
 streams/readable-readall.js n=5000                                   -0.21 %       ±1.86% ±2.48% ±3.22%
 streams/readable-unevenread.js n=1000                                -0.18 %       ±0.47% ±0.63% ±0.81%
 streams/transform-creation.js n=1000000                               0.21 %       ±0.59% ±0.79% ±1.02%
 streams/writable-manywrites.js n=2000000                     ***     55.44 %       ±2.41% ±3.21% ±4.20%

Be aware that when doing many comparisons the risk of a false-positive
result increases. In this case there are 10 comparisons, so you can
expect the following number of false-positive results:
  0.50 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.10 false positives, when considering a   1% risk acceptance (**, ***),
  0.01 false positives, when considering a 0.1% risk acceptance (***)

@nodejs-github-bot nodejs-github-bot added the process Issues and PRs related to the process subsystem. label Feb 7, 2018
@hiroppy hiroppy added the benchmark Issues and PRs related to the benchmark subsystem. label Feb 7, 2018
@BridgeAR BridgeAR added the performance Issues and PRs related to the performance of Node.js. label Feb 7, 2018
@mafintosh (Member Author)

The next-tick-breadth.js benchmark seems to be a bit slower because it ends up inserting more than 1024 elements and so hits the slow path of this implementation. The same goes for the readable-readall benchmark, since for some reason it queues a lot of nextTicks.
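
For readers following along, a rough sketch of the fast/slow path split being described, pieced together from the hunks quoted in the review below; the 1024 limit and the SmallQueue name are from this earlier iteration, and the exact details are illustrative:

const kFixedSize = 1024;

class SmallQueue {
  constructor() {
    this.list = [];
    this.top = 0; // next write index
    this.btm = 0; // next read index
  }
  push(tick) {
    if (this.top >= kFixedSize) {
      // Slow path: more than kFixedSize callbacks are queued at once, so the
      // backing array has to keep growing past the reusable region.
      this.list.push(tick);
      this.top++;
    } else {
      // Fast path: write into a slot that gets reused on later cycles.
      this.list[this.top++] = tick;
    }
  }
  shift() {
    if (this.btm < this.top) {
      const next = this.list[this.btm];
      this.list[this.btm] = undefined;
      this.btm++;
      return next;
    }
    return null;
  }
}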

@mafintosh (Member Author)

Gonna do some investigation to see if I can fix those

head = new SmallQueue();
}

if (head.top === 0 && tail === null) {
Member

The tail === null check is already covered by the one above, so it could also be written as:

if (tail === null) {
  if (head.top >= 1024 && head.btm > 0) {
    // ...
  } else if (head.top === 0) {
    tickInfo...
  }
}


if (tail !== null) {
  next = tail.shift();
  if (next !== null) return next;
Member

If I am not mistaken, this if (next !== null) check could be removed by moving the if (this.btm < this.top) { check from SmallQueue#shift() in here, so that SmallQueue#shift() always returns a value when called.

Member

if (tail !== null) {
  if (tail.btm < tail.top) return tail.shift();
  tail = null;
}


shift() {
  if (this.btm < this.top) {
    var next = this.list[this.btm];
Member

const?

Member

I think using const within an if used to cause some issues with V8. I do not know if this was fixed recently or not. cc @bmeurer

@mcollina mcollina requested a review from bmeurer February 7, 2018 13:17
@mcollina (Member) commented Feb 7, 2018

@addaleax (Member) left a comment

Nice work!

constructor() {
  this.list = [];
  this.top = 0;
  this.btm = 0;
Member

Nit: Can you spell out bottom here?

@benjamingr (Member) left a comment

Fwiw this is what I was talking about @apapirovski

  this.list.push(tick);
  this.top++;
} else {
  this.list[this.top++] = tick;
Member

I'd prefer it if this was split into two lines.

shift() {
  if (this.btm < this.top) {
    var next = this.list[this.btm];
    this.list[this.btm++] = undefined;
Member

I'd prefer it if this line was split into two

@apapirovski (Member)

@benjamingr Sorry, I guess I misunderstood. This just looks like a queue implementation with a sane default size rather than a circular buffer. I think the major gains here are from far less frequent GC and from not creating extra objects.
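
To make the GC/allocation point concrete, here is an illustrative contrast (not code from this PR): a linked-list queue allocates one short-lived wrapper object per queued tick, while a preallocated array just reuses its slots.

// Linked-list style: every push creates a node object that the GC has to
// collect once the tick has run.
function linkedPush(queue, tick) {
  const node = { tick, next: null };
  if (queue.head === null)
    queue.head = queue.tail = node;
  else
    queue.tail = queue.tail.next = node;
}

// Array style: once the slots exist, pushing allocates nothing; the same
// storage is reused on every drain cycle.
function arrayPush(queue, tick) {
  queue.list[queue.top++] = tick;
}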

@mafintosh (Member Author)

@apapirovski @benjamingr Funny, I actually just changed it to be a circular buffer to squeeze even more perf out of it :D

@mafintosh (Member Author)

Updated the benchmark results with the latest iteration. Faster on all points now, and much faster on some! \o/

@mafintosh mafintosh force-pushed the faster-next-tick branch 2 times, most recently from 64d02b1 to ff9b679 Compare February 7, 2018 14:21
@BridgeAR (Member) left a comment

Nice!


function push(data) {
  if (head.list[head.top] !== undefined)
    head = head.next = new FixedQueue(head.size);
Member

I would just use a constant for 2048 and use that here. That way the constructor does not need the size entry.
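
A small sketch of the suggested change (the constant name is illustrative): with a module-level constant, the constructor no longer needs a size argument and callers no longer pass head.size around.

const kQueueSize = 2048; // illustrative constant name

class FixedQueue {
  constructor() {
    this.list = new Array(kQueueSize);
    this.top = 0;
    this.btm = 0;
    this.next = null;
  }
  // ...
}

// The push path can then simply do:
//   head = head.next = new FixedQueue();
// instead of:
//   head = head.next = new FixedQueue(head.size);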

@mafintosh (Member Author)

@BridgeAR fixed your comments, even cleaner now :)

@benjamingr (Member)

@mafintosh just wondering, do you think it would be possible (no pressure) to add a benchmark to test whether or not the perf gains are from less GC or from better cache locality of the buffer/queue?

@apapirovski and I had a (purely theoretical and interesting) discussion a while back about linked lists vs circular buffers, and I'm curious where the gain comes from.

@mafintosh (Member Author)

@benjamingr Unsure how to test that, actually. Side note: we should see if the stream BufferList can get faster using the same approach.

@benjamingr (Member)

@mafintosh and I suspect timers as well - but one step at a time :)

@nodejs nodejs deleted a comment from addaleax Feb 7, 2018
@apapirovski (Member)

@addaleax Sorry, I deleted an outdated review comment of yours instead of mine.

@apapirovski (Member) left a comment

(Messed up last time.)

This is great work. I like the fact that this doesn't resize the circular buffer and instead keeps basically a linked list of new instances.

@bmeurer (Member) left a comment

Nice

const ret = this.head.data;
if (this.head === this.tail) {
this.head = this.tail = null;
var next = this.list[this.btm];
Member

You can safely use const here.

Member Author

Thanks, was wondering if that came with a perf cost.

}

function shift() {
  var next = tail.shift();
Member

Same here, also const.

if (head.list[head.top] !== undefined)
  head = head.next = new FixedQueue();
head.push(data);
if (tickInfo[kHasScheduled] === 0)
@apapirovski (Member) Feb 7, 2018

Manipulating and accessing the AliasedBuffer is somewhat expensive, from my past testing. Is there any other way to check this condition? I'm guessing tail.top === tail.btm or something similar might do the trick?

Member Author

Ah, yeah, lots of easy ways. Wasn't aware that this was expensive :)

Member

I mean, it's not a huge cost or anything, but unless we need to specifically check the data in it, it's better to use other available conditions. Since push is a hot path, this could make a small difference.
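
A sketch of the kind of change being suggested, using the tail.top === tail.btm style condition mentioned above; the names mirror the hunks quoted in this review, it assumes drained buffers are unlinked as they empty, and the exact final code may differ. The idea is to read the cheap JS-side indices on every push and only write to the shared tickInfo array when the queue goes from empty to non-empty:

function push(data) {
  if (head.list[head.top] !== undefined)
    head = head.next = new FixedQueue();
  // A full head was just replaced above, so with a single non-full buffer
  // (head === tail) the condition top === btm can only mean "empty".
  const wasEmpty = head === tail && head.top === head.btm;
  head.push(data);
  if (wasEmpty)
    tickInfo[kHasScheduled] = 1;
}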


class FixedQueue {
  constructor() {
    this.btm = 0;
Member

I think the review comment that got deleted was

Nit: Can you spell out bottom here?

Member

Sorry 😞

@apapirovski (Member)

For what it's worth, at bluebird we never shrank the array, and 4.5 years and over 100M installs later no one really complained about it. Since the memory is just reserved for reuse it didn't end up being a problem in practice for people.

The problem is that we've had people complain in the past about performance when scheduling way too many next ticks (like 1e7 or 1e8). There's even a test that exists because of it... 😆 That's why this particular implementation is appealing to me since the circular buffer doesn't expand its size (instead more of them are created) so once they're processed, the memory usage can go back down again.


In general, a lot of the code in Node has to be overly defensive because of really strange use cases out there... EventEmitter could be like 100% faster if we didn't need to assume that someone might use it to schedule millions of unique event names. 😭

@lpinca (Member) commented Feb 7, 2018

> For what it's worth, at bluebird we never shrank the array, and 4.5 years and over 100M installs later no one really complained about it. Since the memory is just reserved for reuse it didn't end up being a problem in practice for people.

If ignoring shrinking is not a problem, I think the code I posted could be just dropped in to have a faster BufferList.

> That's why this particular implementation is appealing to me since the circular buffer doesn't expand its size (instead more of them are created) so once they're processed, the memory usage can go back down again.

Agreed this is great.

@lpinca (Member) left a comment

Good job.

else
  tickInfo[kHasScheduled] = 1;
}
head.push(data);
Member Author

@apapirovski was able to simplify the tickInfo stuff quite a bit :)

@mafintosh (Member Author)

@addaleax @mcollina this is an LGTM from me now as well.

@addaleax (Member) commented Feb 8, 2018

@BridgeAR BridgeAR added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Feb 8, 2018
@mcollina (Member) commented Feb 9, 2018

Landed as 240f9a2 and da97a47

@mcollina mcollina closed this Feb 9, 2018
mcollina pushed a commit that referenced this pull request Feb 9, 2018
PR-URL: #18617
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Anatoli Papirovski <[email protected]>
Reviewed-By: Benedikt Meurer <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Tiancheng "Timothy" Gu <[email protected]>
mcollina pushed a commit that referenced this pull request Feb 9, 2018
PR-URL: #18617
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Anatoli Papirovski <[email protected]>
Reviewed-By: Benedikt Meurer <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Tiancheng "Timothy" Gu <[email protected]>
@MylesBorins (Contributor)

Should this be backported to v9.x-staging? If yes, please follow the guide and raise a backport PR; if not, let me know or add the dont-land-on label.

@mcollina (Member)

This should not be backported to node 4 and 6.

@addaleax addaleax removed backport-requested-v9.x author ready PRs that have at least one approval, no pending requests for changes, and a CI started. labels Feb 27, 2018
addaleax pushed a commit to addaleax/node that referenced this pull request Feb 27, 2018
PR-URL: nodejs#18617
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Anatoli Papirovski <[email protected]>
Reviewed-By: Benedikt Meurer <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Tiancheng "Timothy" Gu <[email protected]>
addaleax pushed a commit to addaleax/node that referenced this pull request Feb 27, 2018
PR-URL: nodejs#18617
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Anatoli Papirovski <[email protected]>
Reviewed-By: Benedikt Meurer <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Tiancheng "Timothy" Gu <[email protected]>
@addaleax addaleax mentioned this pull request Feb 27, 2018
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
PR-URL: nodejs#18617
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Anatoli Papirovski <[email protected]>
Reviewed-By: Benedikt Meurer <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Tiancheng "Timothy" Gu <[email protected]>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
PR-URL: nodejs#18617
Reviewed-By: Anna Henningsen <[email protected]>
Reviewed-By: Benjamin Gruenbaum <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
Reviewed-By: Anatoli Papirovski <[email protected]>
Reviewed-By: Benedikt Meurer <[email protected]>
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Tiancheng "Timothy" Gu <[email protected]>
@MylesBorins (Contributor)

Opting to not land in v8.x.

Please let me know if we should reconsider.

Labels
  • benchmark: Issues and PRs related to the benchmark subsystem.
  • performance: Issues and PRs related to the performance of Node.js.
  • process: Issues and PRs related to the process subsystem.