process: faster next tick #18617
next-tick-breadth.js seems to be a bit slower because it inserts more than 1024 elements and so hits the slow path of this implementation. The same goes for the readable-readall benchmark, since for some reason it queues a lot of nextTicks.
Gonna do some investigation to see if I can fix those.
lib/internal/process/next_tick.js
Outdated
head = new SmallQueue();
}

if (head.top === 0 && tail === null) {
The tail === null check is already in the above one. So it could also be written as:
if (tail === null) {
  if (head.top >= 1024 && head.btm > 0) {
    // ...
  } else if (head.top === 0) {
    tickInfo...
  }
}
lib/internal/process/next_tick.js
Outdated
if (tail !== null) {
  next = tail.shift();
  if (next !== null) return next;
If I am not mistaken, this if (next !== null) check could be removed by moving the if (this.btm < this.top) { check from SmallQueue#shift() in here and always returning a value when calling SmallQueue#shift().
if (tail !== null) {
if (tail.btm < tail.top) return tail.shift();
tail = null;
}
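A minimal sketch of the guard suggested above, assuming a simplified SmallQueue with the list, top and btm fields seen in the diff. The shiftTail helper and the module-level tail variable are hypothetical names for illustration, not the PR's actual code. The point is that once the caller owns the bounds check, shift() can skip its own emptiness test and always return a value.

```javascript
// Simplified stand-in for the SmallQueue in the diff (an assumption, not the PR's code).
class SmallQueue {
  constructor() {
    this.list = [];
    this.top = 0;
    this.btm = 0;
  }

  push(tick) {
    this.list[this.top] = tick;
    this.top++;
  }

  // No emptiness check here: the caller guarantees btm < top.
  shift() {
    const next = this.list[this.btm];
    this.list[this.btm] = undefined; // drop the reference so it can be collected
    this.btm++;
    return next;
  }
}

let tail = new SmallQueue();

// Hypothetical caller that owns the bounds check.
function shiftTail() {
  if (tail !== null) {
    if (tail.btm < tail.top) return tail.shift();
    tail = null; // drained: release the queue
  }
  return null;
}
```

With two items pushed onto tail, shiftTail() would hand them back in insertion order and then return null while resetting tail.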
lib/internal/process/next_tick.js
Outdated
shift() {
  if (this.btm < this.top) {
    var next = this.list[this.btm];
const?
I think using const within an if used to cause some issues with V8. I do not know if this was fixed recently or not. cc @bmeurer
Nice work!
lib/internal/process/next_tick.js
Outdated
constructor() {
  this.list = [];
  this.top = 0;
  this.btm = 0;
Nit: Can you spell out bottom here?
Fwiw this is what I was talking about @apapirovski
lib/internal/process/next_tick.js
Outdated
  this.list.push(tick);
  this.top++;
} else {
  this.list[this.top++] = tick;
I'd prefer it if this was split to two lines
lib/internal/process/next_tick.js
Outdated
shift() {
  if (this.btm < this.top) {
    var next = this.list[this.btm];
    this.list[this.btm++] = undefined;
I'd prefer it if this line was split into two
@benjamingr Sorry, I guess I misunderstood. This just looks like a queue implementation with a sane default size rather than a circular buffer. I think the major gains here are from far less frequent GC and not creating extra objects.
ebbc8a8 to d86f69b
@apapirovski @benjamingr funny, I actually just changed it to be a circular buffer to squeeze even more perf out of it :D
Updated the benchmark results with the latest iteration. Faster on all points now, and much faster on some! \o/
64d02b1 to ff9b679
Nice!
lib/internal/process/next_tick.js
Outdated
function push(data) {
  if (head.list[head.top] !== undefined)
    head = head.next = new FixedQueue(head.size);
I would just use a constant for 2048 and use that here. That way the constructor does not need the size entry.
@BridgeAR fixed your comments, even cleaner now :)
@mafintosh just wondering, do you think it would be possible (no pressure) to add a benchmark to test whether or not the perf gains are from less GC or from better cache locality of the buffer/queue? Me and @apapirovski had a (purely theoretical and interesting) discussion a while back about linked lists vs circular buffers and I'm curious what the gain is from.
@benjamingr unsure how to test that actually. Side note, we should see if the stream BufferList can get faster using the same approach.
@mafintosh and I suspect timers as well, but one step at a time :)
@addaleax Sorry, I deleted an outdated review comment of yours instead of mine.
(Messed up last time.)
This is great work. I like the fact that this doesn't resize the circular buffer and instead keeps basically a linked list of new instances.
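To make the design praised above concrete, here is a minimal sketch of a queue built as a linked list of fixed-size circular buffers. It is an illustration under assumptions, not the PR's code: the FixedCircularBuffer name, the isEmpty/isFull helpers, the one-empty-slot fullness rule, and the tiny kSize of 8 (chosen so the chaining is easy to observe) are all hypothetical. The head/tail naming follows the diff, where pushes go to head and shifts come from tail.

```javascript
// Hypothetical demo size; the PR settles on 2048. Must be a power of two for masking.
const kSize = 8;
const kMask = kSize - 1;

class FixedCircularBuffer {
  constructor() {
    this.bottom = 0;
    this.top = 0;
    this.list = new Array(kSize);
    this.next = null; // link to the next buffer in the chain
  }

  isEmpty() {
    return this.top === this.bottom;
  }

  // One slot is left unused so that "full" and "empty" are distinguishable.
  isFull() {
    return ((this.top + 1) & kMask) === this.bottom;
  }

  push(data) {
    this.list[this.top] = data;
    this.top = (this.top + 1) & kMask;
  }

  shift() {
    const next = this.list[this.bottom];
    if (next === undefined) return null;
    this.list[this.bottom] = undefined;
    this.bottom = (this.bottom + 1) & kMask;
    return next;
  }
}

class FixedQueue {
  constructor() {
    this.head = this.tail = new FixedCircularBuffer();
  }

  isEmpty() {
    return this.head.isEmpty();
  }

  push(data) {
    // Never resize: when the current buffer is full, chain a fresh one.
    if (this.head.isFull()) {
      this.head = this.head.next = new FixedCircularBuffer();
    }
    this.head.push(data);
  }

  shift() {
    const tail = this.tail;
    const next = tail.shift();
    // A drained buffer with a successor is dropped, so it can be collected.
    if (tail.isEmpty() && tail.next !== null) {
      this.tail = tail.next;
    }
    return next;
  }
}
```

Because drained buffers are unlinked rather than kept inside one ever-growing array, memory held by a burst of queued items can be reclaimed once they are processed. Note that this sketch cannot store undefined as a value, since undefined marks an empty slot.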
Nice
lib/internal/process/next_tick.js
Outdated
const ret = this.head.data;
if (this.head === this.tail) {
this.head = this.tail = null;
var next = this.list[this.btm];
You can safely use const here.
Thanks, was wondering if that came with a perf cost.
lib/internal/process/next_tick.js
Outdated
}

function shift() {
  var next = tail.shift();
Same here, also const.
lib/internal/process/next_tick.js
Outdated
if (head.list[head.top] !== undefined)
  head = head.next = new FixedQueue();
head.push(data);
if (tickInfo[kHasScheduled] === 0)
From my past testing, manipulating and accessing the Aliased Buffer is somewhat expensive. Is there any other way to check this condition? I'm guessing tail.top === tail.btm or something similar might do the trick?
Ah, ya lots of easy ways. Wasn't aware that this was expensive :)
I mean, it's not a huge cost or anything but unless we need to specifically check the data in it, it's better to use other available conditions. Since push is a hot path, this could make a small difference.
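A rough sketch of the idea in this thread: derive "was the queue empty?" from the queue's own indices instead of reading a shared flag such as tickInfo[kHasScheduled]. Everything here is hypothetical and simplified: the plain queue object, the scheduleTick stub, and the wakeups counter exist only to make the behavior observable, and the field names top and btm are borrowed from the diff.

```javascript
// Hypothetical, simplified queue state; field names follow the diff (top, btm).
const queue = { list: [], top: 0, btm: 0 };
let wakeups = 0; // counts how often the (hypothetical) tick scheduler is poked

function scheduleTick() {
  wakeups++;
}

function push(data) {
  // Check before inserting: equal indices mean nothing was pending.
  const wasEmpty = queue.top === queue.btm;
  queue.list[queue.top] = data;
  queue.top++;
  if (wasEmpty) scheduleTick();
}

function shift() {
  if (queue.btm === queue.top) return null;
  const next = queue.list[queue.btm];
  queue.list[queue.btm] = undefined;
  queue.btm++;
  return next;
}
```

The comparison touches only plain object fields, so the hot push path avoids any access to the shared buffer; the scheduler is poked only on the empty-to-non-empty transition.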
lib/internal/process/next_tick.js
Outdated
class FixedQueue {
  constructor() {
    this.btm = 0;
I think the review comment that got deleted was:
Nit: Can you spell out bottom here?
Sorry 😞
The problem is that we've had people complain in the past about performance when scheduling way too many next ticks (like 1e7 or 1e8). There's even a test that exists because of it... 😆 That's why this particular implementation is appealing to me since the circular buffer doesn't expand its size (instead more of them are created) so once they're processed, the memory usage can go back down again. In general, a lot of the code in Node has to be overly defensive because of really strange use cases out there... EventEmitter could be like 100% faster if we didn't need to assume that someone might use it to schedule millions of unique event names. 😭
If ignoring shrinking is not a problem, I think the code I posted could be just dropped in to have a faster
Agreed this is great.
Good job.
6b3f40e to 213a5ab
  else
    tickInfo[kHasScheduled] = 1;
}
head.push(data);
@apapirovski was able to simplify the tickInfo stuff quite a bit :)
PR-URL: #18617
Reviewed-By: Anna Henningsen
Reviewed-By: Benjamin Gruenbaum
Reviewed-By: Ruben Bridgewater
Reviewed-By: Anatoli Papirovski
Reviewed-By: Benedikt Meurer
Reviewed-By: Matteo Collina
Reviewed-By: Luigi Pinca
Reviewed-By: Tiancheng "Timothy" Gu
Should this be backported to
This should not be backported to node 4 and 6.
Opting to not land in v8.x. Please lmk if we should reconsider.
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes
Affected core subsystem(s)
process
Moves the nextTick implementation to a series of fixed size circular arrays instead of the linked list. The size for each array is fixed at 2048, which seems to provide the best perf.
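The description does not say how indices wrap around inside each 2048-slot array. One common approach, assumed here purely for illustration, is to mask the index, which works because 2048 is a power of two. The advance helper and the kSize/kMask names are hypothetical.

```javascript
const kSize = 2048;      // array size stated in the description
const kMask = kSize - 1; // all low bits set; valid only because kSize is a power of two

// Hypothetical helper: advance an index by one slot, wrapping at kSize.
function advance(index) {
  return (index + 1) & kMask;
}
```

For any index in the range 0 to 2047, advance(index) matches (index + 1) % kSize, so the last slot wraps back to slot 0 without a division or a branch.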
Since streams are big nextTick users these days, I added a pipe benchmark also to see the impact.
NextTick benchmark
Streams benchmark