-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix some EOF-handling issues in TLS #44563
Conversation
Like errno, OpenSSL's API requires SSL_get_error and error queue be checked immediately after the failing operation, otherwise the error queue or SSL object may have changed state and no longer report information about the operation the caller wanted. TLSWrap almost heeds this rule, except in TLSWrap::ClearOut. If SSL_read picks up a closing alert (detected by checking SSL_get_shutdown), Node calls out to JS with EmitRead(UV_EOF) and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error. In particular, I've observed that, on close_notify, JS seems to sometimes call back into TLSWrap::DoShutdown, calling SSL_shutdown. (I think this comes from onStreamRead in stream_base_commons.js?) Instead, SSL_get_error and the error queue should be sampled earlier. Back in nodejs#1661, Node needed to account for GetSSLError being called after ssl_ was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case where ssl_ may be destroyed before SSL_get_error is a case where ssl_ or the error queue could change state, so it's a bug either way.) This is the first of two fixes in error-handling here. The EmitRead(UV_EOF) seems to additionally swallow fatal alerts from the peer. Some of the ECONNRESET expectations in the tests aren't actually correct. The next commit will fix this as well.
Review requested:
|
SSL_RECEIVED_SHUTDOWN means not just close_notify or fatal alert. From what I can tell, this was just a mistake? OnStreamRead's comment suggests eof_ was intended to be for close_notify. This fixes a bug in TLSSocket error reporting that seems to have made it into existing tests. If we receive a fatal alert, EmitRead(UV_EOF) would, via onConnectEnd in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node. See the tests that had to be updated as part of this.
bfa2092
to
030b8c2
Compare
(Updated the second commit to hopefully fix the lint errors.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RSLGTM
This should be reviewed by someone who is more familiar with this code and TLS. @bnoordhuis @jasnell? |
Friendly ping. Is this waiting on anything from my end? |
No. I'll try to ping @nodejs/crypto again. |
Landed in ef8aa88...8b89d4d |
Like errno, OpenSSL's API requires SSL_get_error and error queue be checked immediately after the failing operation, otherwise the error queue or SSL object may have changed state and no longer report information about the operation the caller wanted. TLSWrap almost heeds this rule, except in TLSWrap::ClearOut. If SSL_read picks up a closing alert (detected by checking SSL_get_shutdown), Node calls out to JS with EmitRead(UV_EOF) and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error. In particular, I've observed that, on close_notify, JS seems to sometimes call back into TLSWrap::DoShutdown, calling SSL_shutdown. (I think this comes from onStreamRead in stream_base_commons.js?) Instead, SSL_get_error and the error queue should be sampled earlier. Back in #1661, Node needed to account for GetSSLError being called after ssl_ was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case where ssl_ may be destroyed before SSL_get_error is a case where ssl_ or the error queue could change state, so it's a bug either way.) This is the first of two fixes in error-handling here. The EmitRead(UV_EOF) seems to additionally swallow fatal alerts from the peer. Some of the ECONNRESET expectations in the tests aren't actually correct. The next commit will fix this as well. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>
SSL_RECEIVED_SHUTDOWN means not just close_notify or fatal alert. From what I can tell, this was just a mistake? OnStreamRead's comment suggests eof_ was intended to be for close_notify. This fixes a bug in TLSSocket error reporting that seems to have made it into existing tests. If we receive a fatal alert, EmitRead(UV_EOF) would, via onConnectEnd in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node. See the tests that had to be updated as part of this. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>
Like errno, OpenSSL's API requires SSL_get_error and error queue be checked immediately after the failing operation, otherwise the error queue or SSL object may have changed state and no longer report information about the operation the caller wanted. TLSWrap almost heeds this rule, except in TLSWrap::ClearOut. If SSL_read picks up a closing alert (detected by checking SSL_get_shutdown), Node calls out to JS with EmitRead(UV_EOF) and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error. In particular, I've observed that, on close_notify, JS seems to sometimes call back into TLSWrap::DoShutdown, calling SSL_shutdown. (I think this comes from onStreamRead in stream_base_commons.js?) Instead, SSL_get_error and the error queue should be sampled earlier. Back in #1661, Node needed to account for GetSSLError being called after ssl_ was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case where ssl_ may be destroyed before SSL_get_error is a case where ssl_ or the error queue could change state, so it's a bug either way.) This is the first of two fixes in error-handling here. The EmitRead(UV_EOF) seems to additionally swallow fatal alerts from the peer. Some of the ECONNRESET expectations in the tests aren't actually correct. The next commit will fix this as well. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>
SSL_RECEIVED_SHUTDOWN means not just close_notify or fatal alert. From what I can tell, this was just a mistake? OnStreamRead's comment suggests eof_ was intended to be for close_notify. This fixes a bug in TLSSocket error reporting that seems to have made it into existing tests. If we receive a fatal alert, EmitRead(UV_EOF) would, via onConnectEnd in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node. See the tests that had to be updated as part of this. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>
Like errno, OpenSSL's API requires SSL_get_error and error queue be checked immediately after the failing operation, otherwise the error queue or SSL object may have changed state and no longer report information about the operation the caller wanted. TLSWrap almost heeds this rule, except in TLSWrap::ClearOut. If SSL_read picks up a closing alert (detected by checking SSL_get_shutdown), Node calls out to JS with EmitRead(UV_EOF) and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error. In particular, I've observed that, on close_notify, JS seems to sometimes call back into TLSWrap::DoShutdown, calling SSL_shutdown. (I think this comes from onStreamRead in stream_base_commons.js?) Instead, SSL_get_error and the error queue should be sampled earlier. Back in #1661, Node needed to account for GetSSLError being called after ssl_ was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case where ssl_ may be destroyed before SSL_get_error is a case where ssl_ or the error queue could change state, so it's a bug either way.) This is the first of two fixes in error-handling here. The EmitRead(UV_EOF) seems to additionally swallow fatal alerts from the peer. Some of the ECONNRESET expectations in the tests aren't actually correct. The next commit will fix this as well. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>
SSL_RECEIVED_SHUTDOWN means not just close_notify or fatal alert. From what I can tell, this was just a mistake? OnStreamRead's comment suggests eof_ was intended to be for close_notify. This fixes a bug in TLSSocket error reporting that seems to have made it into existing tests. If we receive a fatal alert, EmitRead(UV_EOF) would, via onConnectEnd in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node. See the tests that had to be updated as part of this. PR-URL: #44563 Reviewed-By: Luigi Pinca <[email protected]>
This fixes a pair of issues with EOF handling in TLSWrap. This originally came up because @codebytere noticed an incompatibility with Node and a recent BoringSSL change, but the root cause is a bug in Node that also impacts how it uses OpenSSL. I've split it into two commits in case you all want to backport only one of them, as the second changes how some errors are surfaced. Node was previously incorrectly representing a class of TLS errors are
ECONNRESET
, dropping the true error on the ground.The changes are as follows:
tls: fix re-entrancy issue with TLS close_notify
Like
errno
, OpenSSL's API requiresSSL_get_error
and error queue be checked immediately after the failing operation, otherwise the error queue orSSL
object may have changed state and no longer report information about the operation the caller wanted.TLSWrap
almost heeds this rule, except inTLSWrap::ClearOut
. IfSSL_read
picks up a closing alert (detected by checkingSSL_get_shutdown
), Node calls out to JS withEmitRead(UV_EOF)
and only afterwards proceeds to dispatch on the error. But, by this point, Node has already re-entered JS, which may change the error.In particular, I've observed that, on close_notify, JS seems to sometimes call back into
TLSWrap::DoShutdown
, callingSSL_shutdown
. (I think this comes fromonStreamRead
in stream_base_commons.js?)Instead,
SSL_get_error
and the error queue should be sampled earlier. Back in #1661, Node needed to account forGetSSLError
being called afterssl_
was destroyed. This was the real cause. With this fixed, there's no need to account for this. (Any case wheressl_
may be destroyed beforeSSL_get_error
is a case wheressl_
or the error queue could change state, so it's a bug either way.)I've done this by just moving the calls up a bit, to preserve the existing C++ => JS behavior, but that revealed a bigger issue about
EmitRead(UV_EOF)
, which is the second commit.tls: don't treat fatal TLS alerts as EOF
SSL_RECEIVED_SHUTDOWN
means not just close_notify but also fatal alert. From what I can tell, treating fatal alert as EOF was just a mistake?OnStreamRead
's comment suggestseof_
was intended to be for close_notify.This fixes a bug in
TLSSocket
error reporting that seems to have made it into existing tests. If we receive a fatal alert,EmitRead(UV_EOF)
would, viaonConnectEnd
in _tls_wrap.js, synthesize an ECONNRESET before the alert itself is surfaced. As a result, TLS alerts received during the handshake are misreported by Node.See the tests that had to be updated as part of this.
I should note, while I'm familiar with TLS and OpenSSL, I'm not at all familiar with Node's streams machinery or sockets APIs. Do you all actually expect to emit EOF on error? This change removes an
EmitRead(UV_EOF)
on fatal TLS alert. Is that consistent with JS state? I'm assuming so, because there are plenty of other fatal errors whichTLSWrap
did not emit EOF on (anything case where we reject something from the peer, rather than seeing an alert message, will not setSSL_RECEIVED_SHUTDOWN
). So I assume this change just makes peer fatal alerts go through the same codepaths as other peer errors. But hopefully you all will be able to review this better.I suspect there's also a way to construct a test which demonstrates the first fix standalone. We ran into it in BoringSSL because, to fix some other bugs in OpenSSL's error-handling, we had to treat
SSL_ERROR_ZERO_RETURN
more like the otherSSL_ERROR_*
constants. OpenSSL is still quite lax about it, which is masking this bug in the particular scenario where we hit it. But there's probably a more interesting case where reentrancy causes Node to lose a TLS alert. Except you all seem to already be losing TLS alerts until the second commit anyway, so I'm not sure. :-)