-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
worker: fix stream & termination race bug #42874
worker: fix stream & termination race bug #42874
Conversation
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: nodejs#38418
I used the repro in the referenced issue: https://github.com/Prinzhorn/electron-sigabrt Here's the backtrace on a debug build & also some GDB output:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add a test?
However when worker threads are involved, OnStreamAfterReqFinished may
be called after the worker thread termination has begun via
worker.terminate(). This makes v8::Object::Has return Nothing,
which triggers an assert.
Why would calling this function after worker.terminate()
return Nothing
?
Possibly. The repro only crashes rarely (something like 1% of the time
Because
which in turn uses this macro:
and |
ce6ca47
to
cf200b5
Compare
OK. I was able to add a test. On my machine, it fails 100/100 times I styled this after some of the other "terminate" tests. cc @addaleax as |
Right - I mean what exception is causing this? IsExecutionTerminatingCheck() would return true only if there is an exception scheduled to be thrown. node/deps/v8/src/api/api-inl.h Lines 236 to 242 in 02e0c17
I'm not able to reproduce the crash locally with this test. 🤔 |
My thoughts were that the current flow is something like this:
So I think the answer to your question is "the I suppose another option here would be to check
Can you share what system you are on? For me, that test case reproduces 100% of the time on both a 2019 Macbook Pro (MacBookPro16,1 running Big Sur 11.6) and a cloud VM (AWS m5a.2xlarge, Ubuntu 20.04.3 LTS, Linux version 5.11.0-1027-aws). (It does seem to be pretty timing-dependent so looping could help with flakiness -- if I do it while |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for elaborating! This seems like the same case as #42910.
Can you share what system you are on?
I had initially tested this on my Intel MBP running Monterey 12.3.1 and later on Ubuntu 20.04 running on Parallels Linux because I saw your GDB logs and assumed that you were on Linux.
I have suggested a test that crashes on all of these platforms consistently.
Thank you for the suggestions! I pasted the test code in & changed to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This comment was marked as outdated.
This comment was marked as outdated.
@addaleax PTAL. |
So it looks like the two CI builds both failed on the new test with a timeout:
Any thoughts here? it takes much less than 2 minutes on my system, but I suppose there's a lot of contention in CI:
|
I think you could try to quit the process in a |
https://ci.nodejs.org/job/node-test-commit-custom-suites-freestyle/ URLs are going to be the tests that run in Worker threads, so you'll want to try with
|
Thanks, that did it. I cribbed the |
Landed in 0fc1cf4 |
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
Somebody in #43304 asked if this could be backported to earlier releases (v14 / v16). I think that is a good idea. I am not sure how that process works, but it seems like this bug fix would cherry-pick cleanly onto those branches. How can I get these changes backported? |
node/doc/contributing/backporting-to-release-lines.md Lines 19 to 25 in a0440c9
It should be included in the next version of Node.js 16.x without any action from you; regarding v14.x, it's in maintenance mode so it won't land there unless someone opens a backport PR to |
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: #38418 PR-URL: #42874 Reviewed-By: Darshan Sen <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
`OnStreamAfterReqFinished` uses `v8::Object::Has` to check if it needs to call `oncomplete`. `v8::Object::Has` needs to execute Javascript. However when worker threads are involved, `OnStreamAfterReqFinished` may be called after the worker thread termination has begun via `worker.terminate()`. This makes `v8::Object::Has` return `Nothing`, which triggers an assert. This diff fixes the issue by simply defaulting us to `false` in the case where `Nothing` is returned. This is sound because we can't execute `oncomplete` anyway as the isolate is terminating. Fixes: nodejs/node#38418 PR-URL: nodejs/node#42874 Reviewed-By: Darshan Sen <[email protected]> Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Antoine du Hamel <[email protected]>
OnStreamAfterReqFinished
usesv8::Object::Has
to check if it needsto call
oncomplete
.v8::Object::Has
needs to execute Javascript.However when worker threads are involved,
OnStreamAfterReqFinished
maybe called after the worker thread termination has begun via
worker.terminate()
. This makesv8::Object::Has
returnNothing
,which triggers an assert.
This diff fixes the issue by checking
env->is_stopping()
first.Fixes: #38418