-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
inspector: simplify dispatchProtocolMessage #49780
inspector: simplify dispatchProtocolMessage #49780
Conversation
cc @nodejs/inspector @nodejs/cpp-reviewers |
Nit: the commit message does not follow our guidelines. Please change it to begin with an imperative verb after the subsystem prefix. |
I think at this point we can confidently say there will never be a binary message in inspector protocol 😀 (it is pretty much set in stone at this point). Why not just delete |
PR message seems misleading. Proposed change does not seem to touch any UTF8/UTF16 shenanigans unless I am missing something. |
I changed the commit and the PR name. |
It does. And that's a key point. We are saving some non-trivial processing. The current code calls std::unique_ptr<Value> parseJSON(const std::string_view string) {
if (string.empty())
return nullptr;
size_t expected_utf16_length =
simdutf::utf16_length_from_utf8(string.data(), string.length());
MaybeStackBuffer<char16_t> buffer(expected_utf16_length);
// simdutf::convert_utf8_to_utf16 returns zero in case of error.
size_t utf16_length = simdutf::convert_utf8_to_utf16(
string.data(), string.length(), buffer.out());
// We have that utf16_length == expected_utf16_length if and only
// if the input was a valid UTF-8 string.
if (utf16_length == 0) return nullptr; // We had an invalid UTF-8 input.
CHECK_EQ(expected_utf16_length, utf16_length);
return parseJSONCharacters(reinterpret_cast<const uint16_t*>(buffer.out()),
utf16_length);
} Observe how it converts to UTF-16. Instead, we call directly std::unique_ptr<Value> parseJSON(v8_inspector::StringView string) {
if (string.length() == 0)
return nullptr;
if (string.is8Bit())
return parseJSONCharacters(string.characters8(), string.length());
return parseJSONCharacters(string.characters16(), string.length());
} See how we are bypassing We are possibly skipping some allocation and some transcoding... It is not going to make a huge difference unless you are typing really fast on your keyboard, but... code simplification is always a positive, I would think. |
Yes. Let me try. |
@eugeneo Last commit removes parseMessage entirely since it is seemingly not needed. |
@addaleax @alexkozy Can you help? This PR get the following issue under macOS in CI:
If it is indeed an issue introduced by the current PR, I would like to be able to reproduce it. Running the script with Node 20 (public release), I get...
On my macBook, building the software, I get...
and
So everything is fine, as far as I can see. This code is built with the patch from the current PR, e.g., we have void dispatchProtocolMessage(const StringView& message) {
std::string raw_message = protocol::StringUtil::StringViewToUtf8(message);
per_process::Debug(DebugCategory::INSPECTOR_SERVER,
"[inspector received] %s\n",
raw_message);
std::unique_ptr<protocol::DictionaryValue> value =
protocol::DictionaryValue::cast(
protocol::StringUtil::parseJSON(message)); |
The macOS test seems to pass now. :-/ |
Landed in 2990390 |
PR-URL: nodejs#49780 Reviewed-By: Yagiz Nizipli <[email protected]>
PR-URL: nodejs#49780 Reviewed-By: Yagiz Nizipli <[email protected]>
PR-URL: #49780 Reviewed-By: Yagiz Nizipli <[email protected]>
PR-URL: nodejs#49780 Reviewed-By: Yagiz Nizipli <[email protected]>
This is in relation with bug #47457 where we found that under Windows, there seems to be a problem when converting from UTF-8 to UTF-16 a JSON message generated by key presses in the REPL.
Tracing it back, I found evidence that in common cases, each time you hit a key in the Node REPL, a call to
dispatchProtocolMessage
is made, passing an UTF-16 JSON message. The message is converted to UTF-8 into the local variableraw_message
, and then we callparseMessage
which converts backraw_message
into UTF-16 and callsparseJSON
on it.This PR avoids the conversion from UTF-8 back into UTF-16. This should save some processing that is currently done with each key press.
It is possible that it is a Chesterton's fence and that the complexity is somehow necessary. Nevertheless, if this PR is correct, it would be an interesting optimization for the REPL.