-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
src: added big-endian check and support to UTF-16 encode #3410
Conversation
@@ -857,7 +857,17 @@ Local<Value> StringBytes::Encode(Isolate* isolate, | |||
const uint16_t* buf, | |||
size_t buflen) { | |||
Local<String> val; | |||
|
|||
uint16_t dst[buflen]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't do this. buflen
may be on the order of 100s of megabytes.
Instead of copy/pasting the byte-swapping code, it would be much better to move it into a small utility class. |
// http://nodejs.org/api/buffer.html regarding Node's "ucs2" | ||
// encoding specification | ||
for (size_t i = 0; i < buflen; i++) { | ||
dst[i] = (buf[i] << 8) | (buf[i] >> 8); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be worth using intrinsics to ensure efficient byteswap instructions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need. gcc and clang recognize the byte swap just fine.
agree with @bnoordhuis ... there are a few places in the code now where we do this kind of reordering. We should be refactoring it down rather than cutting and pasting it elsewhere. |
4564f5e
to
d4380a1
Compare
@srl295 @bnoordhuis ... @exinfinitum ... just an fyi... we don't get notified automatically when a PR has been updated with a new push. It's helpful if you comment and mention us after the new push so we can get notified and come back to review. |
@@ -23,7 +24,6 @@ using v8::String; | |||
using v8::Value; | |||
using v8::MaybeLocal; | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No whitespace changes please.
Commit log style nit: please see CONTRIBUTING.md for how it should be written. |
PR Updated |
@@ -3,9 +3,11 @@ | |||
#include "node.h" | |||
#include "node_buffer.h" | |||
#include "v8.h" | |||
#include "util.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check to see if we need to include util.h here?
Updated |
@@ -857,7 +856,15 @@ Local<Value> StringBytes::Encode(Isolate* isolate, | |||
const uint16_t* buf, | |||
size_t buflen) { | |||
Local<String> val; | |||
|
|||
std::vector<uint16_t> dst(buflen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd declare an empty std::vector here and .resize() it inside the if block. That avoids the memory allocation on little-endian architectures.
9e0881f
to
cb4874c
Compare
Updated |
static_assert(sizeof(TypeName) <= 8 | ||
&& (sizeof(TypeName) & 0x1) == 0, "Unsupported type"); | ||
|
||
// reinterpret src as unsigned because we want unsigned shifts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no shift operation anymore :)
Versions of Node.js after v0.12 have relocated byte-swapping away from the StringBytes::Encode function, thereby causing a nan test (which accesses this function directly) to fail on big-endian machines. This change re-introduces byte swapping in StringBytes::Encode, done via a call to a function in util-inl. Another change in NodeBuffer::StringSlice was necessary to avoid double byte swapping in big-endian function calls to StringSlice. PR-URL: #3410 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Trevor Norris <[email protected]>
Notable changes: This update to the LTS line includes a number of semver minor changes that have been staged for a number of months. This includes: * deps: backport 9da3ab6 from V8 upstream (Ali Ijaz Sheikh) - #3609 * http: handle errors on idle sockets (José F. Romaniello) - #4482 * src: add BE support to StringBytes::Encode() (Bryon Leung) - #3410 * tls: add `options` argument to createSecurePair (Коренберг Марк) - #2441 There are also quite a large number of semver patch changes including over 20 doc fixes and almost 50 test fixes. Notable semver patch changes include: * deps: upgrade to npm 2.14.18 (Kat Marchán) - #5245 * https: evict cached sessions on error (Fedor Indutny) - #4982 * process: support symbol events (cjihrig) - #4798 * querystring: improve parse() performance (Brian White) - #4675 PR-URL: #5301
In December we announced that we would be doing a minor release in order to get a number of voted on SEMVER-MINOR changes into LTS. Our ability to release this was delayed due to the unforeseen security release v4.3. We are quickly bumping to v4.4 in order to bring you the features that we had committed to releasing. Notable changes: The SEMVER-MINOR changes include: * **deps**: - An update to v8 that introduces a new flag --perf_basic_prof_only_functions (Ali Ijaz Sheikh) #3609 * **http**: - A new feature in http(s) agent that catches errors on *keep alived* connections (José F. Romaniello) #4482 * **src**: - Better support for Big-Endian systems (Bryon Leung) #3410 * **tls**: - A new feature that allows you to pass common SSL options to `tls.createSecurePair` (Коренберг Марк) #2441 Notable semver patch changes include: * **npm** - upgrade to npm 2.14.19 (Kat Marchán) #5335 * **https**: - A potential fix for #3692) HTTP/HTTPS client requests throwing EPROTO (Fedor Indutny) #4982 * **process**: - Add support for symbols in event emitters. Symbols didn't exist when it was written ¯\_(ツ)_/¯ (cjihrig) #4798 * **querystring**: - querystring.parse() is now 13-22% faster! (Brian White) #4675 PR-URL: #5301
In December we announced that we would be doing a minor release in order to get a number of voted on SEMVER-MINOR changes into LTS. Our ability to release this was delayed due to the unforeseen security release v4.3. We are quickly bumping to v4.4 in order to bring you the features that we had committed to releasing. This release also includes security updates to openssl. More information can be found [on nodejs.org](https://nodejs.org/en/blog/vulnerability/openssl-march-2016/) This release also includes over 70 fixes to our docs and over 50 fixes to tests. The SEMVER-MINOR changes include: * deps: - An update to v8 that introduces a new flag --perf_basic_prof_only_functions (Ali Ijaz Sheikh) #3609 * http: - A new feature in http(s) agent that catches errors on *keep alived* connections (José F. Romaniello) #4482 * src: - Better support for Big-Endian systems (Bryon Leung) #3410 * tls: - A new feature that allows you to pass common SSL options to `tls.createSecurePair` (Коренберг Марк) #2441 * tools - a new flag `--prof-process` which will execute the tick processor on the provided isolate files (Matt Loring) #4021 Notable semver patch changes include: * buld: - Support python path that includes spaces. This should be of particular interest to our Windows users who may have python living in `c:/Program Files` (Felix Becker) #4841 * https: - A potential fix for #3692 HTTP/HTTPS client requests throwing EPROTO (Fedor Indutny) #4982 * installer: - More readable profiling information from isolate tick logs (Matt Loring) #3032 * *npm: - upgrade to npm 2.14.20 (Kat Marchán) #5510 * *openssl: - upgrade openssl to 1.0.2g (Ben Noordhuis) #5507 * process: - Add support for symbols in event emitters. Symbols didn't exist when it was written ¯\_(ツ)_/¯ (cjihrig) #4798 * querystring: - querystring.parse() is now 13-22% faster! (Brian White) #4675 * streams: - performance improvements for moving small buffers that shows a 5% throughput gain. IoT projects have been seen to be as much as 10% faster with this change! (Matteo Collina) #4354 * tools: - eslint has been updated to version 2.1.0 (Rich Trott) #5214 PR-URL: #5301
In December we announced that we would be doing a minor release in order to get a number of voted on SEMVER-MINOR changes into LTS. Our ability to release this was delayed due to the unforeseen security release v4.3. We are quickly bumping to v4.4 in order to bring you the features that we had committed to releasing. This release also includes security updates to openssl. More information can be found [on nodejs.org](https://nodejs.org/en/blog/vulnerability/openssl-march-2016/) This release also includes over 70 fixes to our docs and over 50 fixes to tests. The SEMVER-MINOR changes include: * deps: - An update to v8 that introduces a new flag --perf_basic_prof_only_functions (Ali Ijaz Sheikh) nodejs#3609 * http: - A new feature in http(s) agent that catches errors on *keep alived* connections (José F. Romaniello) nodejs#4482 * src: - Better support for Big-Endian systems (Bryon Leung) nodejs#3410 * tls: - A new feature that allows you to pass common SSL options to `tls.createSecurePair` (Коренберг Марк) nodejs#2441 * tools - a new flag `--prof-process` which will execute the tick processor on the provided isolate files (Matt Loring) nodejs#4021 Notable semver patch changes include: * buld: - Support python path that includes spaces. This should be of particular interest to our Windows users who may have python living in `c:/Program Files` (Felix Becker) nodejs#4841 * https: - A potential fix for nodejs#3692 HTTP/HTTPS client requests throwing EPROTO (Fedor Indutny) nodejs#4982 * installer: - More readable profiling information from isolate tick logs (Matt Loring) nodejs#3032 * *npm: - upgrade to npm 2.14.20 (Kat Marchán) nodejs#5510 * *openssl: - upgrade openssl to 1.0.2g (Ben Noordhuis) nodejs#5507 * process: - Add support for symbols in event emitters. Symbols didn't exist when it was written ¯\_(ツ)_/¯ (cjihrig) nodejs#4798 * querystring: - querystring.parse() is now 13-22% faster! (Brian White) nodejs#4675 * streams: - performance improvements for moving small buffers that shows a 5% throughput gain. IoT projects have been seen to be as much as 10% faster with this change! (Matteo Collina) nodejs#4354 * tools: - eslint has been updated to version 2.1.0 (Rich Trott) nodejs#5214 PR-URL: nodejs#5301
Versions of Node.js after v0.12 have relocated byte-swapping away from the StringBytes::Encode function, thereby causing a nan test (which accesses this function directly) to fail on big-endian machines. This change re-introduces byte swapping in StringBytes::Encode, done via a call to a function in util-inl. Another change in NodeBuffer::StringSlice was necessary to avoid double byte swapping in big-endian function calls to StringSlice. PR-URL: #3410 Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: Trevor Norris <[email protected]>
In December we announced that we would be doing a minor release in order to get a number of voted on SEMVER-MINOR changes into LTS. Our ability to release this was delayed due to the unforeseen security release v4.3. We are quickly bumping to v4.4 in order to bring you the features that we had committed to releasing. This release also includes over 70 fixes to our docs and over 50 fixes to tests. The SEMVER-MINOR changes include: * deps: - An update to v8 that introduces a new flag --perf_basic_prof_only_functions (Ali Ijaz Sheikh) #3609 * http: - A new feature in http(s) agent that catches errors on *keep alived* connections (José F. Romaniello) #4482 * src: - Better support for Big-Endian systems (Bryon Leung) #3410 * tls: - A new feature that allows you to pass common SSL options to `tls.createSecurePair` (Коренберг Марк) #2441 * tools - a new flag `--prof-process` which will execute the tick processor on the provided isolate files (Matt Loring) #4021 Notable semver patch changes include: * buld: - Support python path that includes spaces. This should be of particular interest to our Windows users who may have python living in `c:/Program Files` (Felix Becker) #4841 * https: - A potential fix for #3692 HTTP/HTTPS client requests throwing EPROTO (Fedor Indutny) #4982 * installer: - More readable profiling information from isolate tick logs (Matt Loring) #3032 * *npm: - upgrade to npm 2.14.20 (Kat Marchán) #5510 * process: - Add support for symbols in event emitters. Symbols didn't exist when it was written ¯\_(ツ)_/¯ (cjihrig) #4798 * querystring: - querystring.parse() is now 13-22% faster! (Brian White) #4675 * streams: - performance improvements for moving small buffers that shows a 5% throughput gain. IoT projects have been seen to be as much as 10% faster with this change! (Matteo Collina) #4354 * tools: - eslint has been updated to version 2.1.0 (Rich Trott) #5214 PR-URL: #5301
In December we announced that we would be doing a minor release in order to get a number of voted on SEMVER-MINOR changes into LTS. Our ability to release this was delayed due to the unforeseen security release v4.3. We are quickly bumping to v4.4 in order to bring you the features that we had committed to releasing. This release also includes over 70 fixes to our docs and over 50 fixes to tests. The SEMVER-MINOR changes include: * deps: - An update to v8 that introduces a new flag --perf_basic_prof_only_functions (Ali Ijaz Sheikh) #3609 * http: - A new feature in http(s) agent that catches errors on *keep alived* connections (José F. Romaniello) #4482 * src: - Better support for Big-Endian systems (Bryon Leung) #3410 * tls: - A new feature that allows you to pass common SSL options to `tls.createSecurePair` (Коренберг Марк) #2441 * tools - a new flag `--prof-process` which will execute the tick processor on the provided isolate files (Matt Loring) #4021 Notable semver patch changes include: * buld: - Support python path that includes spaces. This should be of particular interest to our Windows users who may have python living in `c:/Program Files` (Felix Becker) #4841 * https: - A potential fix for #3692 HTTP/HTTPS client requests throwing EPROTO (Fedor Indutny) #4982 * installer: - More readable profiling information from isolate tick logs (Matt Loring) #3032 * *npm: - upgrade to npm 2.14.20 (Kat Marchán) #5510 * process: - Add support for symbols in event emitters. Symbols didn't exist when it was written ¯\_(ツ)_/¯ (cjihrig) #4798 * querystring: - querystring.parse() is now 13-22% faster! (Brian White) #4675 * streams: - performance improvements for moving small buffers that shows a 5% throughput gain. IoT projects have been seen to be as much as 10% faster with this change! (Matteo Collina) #4354 * tools: - eslint has been updated to version 2.1.0 (Rich Trott) #5214 PR-URL: #5301
In December we announced that we would be doing a minor release in order to get a number of voted on SEMVER-MINOR changes into LTS. Our ability to release this was delayed due to the unforeseen security release v4.3. We are quickly bumping to v4.4 in order to bring you the features that we had committed to releasing. This release also includes over 70 fixes to our docs and over 50 fixes to tests. The SEMVER-MINOR changes include: * deps: - An update to v8 that introduces a new flag --perf_basic_prof_only_functions (Ali Ijaz Sheikh) #3609 * http: - A new feature in http(s) agent that catches errors on *keep alived* connections (José F. Romaniello) #4482 * src: - Better support for Big-Endian systems (Bryon Leung) #3410 * tls: - A new feature that allows you to pass common SSL options to `tls.createSecurePair` (Коренберг Марк) #2441 * tools - a new flag `--prof-process` which will execute the tick processor on the provided isolate files (Matt Loring) #4021 Notable semver patch changes include: * buld: - Support python path that includes spaces. This should be of particular interest to our Windows users who may have python living in `c:/Program Files` (Felix Becker) #4841 * https: - A potential fix for #3692 HTTP/HTTPS client requests throwing EPROTO (Fedor Indutny) #4982 * installer: - More readable profiling information from isolate tick logs (Matt Loring) #3032 * *npm: - upgrade to npm 2.14.20 (Kat Marchán) #5510 * process: - Add support for symbols in event emitters. Symbols didn't exist when it was written ¯\_(ツ)_/¯ (cjihrig) #4798 * querystring: - querystring.parse() is now 13-22% faster! (Brian White) #4675 * streams: - performance improvements for moving small buffers that shows a 5% throughput gain. IoT projects have been seen to be as much as 10% faster with this change! (Matteo Collina) #4354 * tools: - eslint has been updated to version 2.1.0 (Rich Trott) #5214 PR-URL: #5301
In December we announced that we would be doing a minor release in order to get a number of voted on SEMVER-MINOR changes into LTS. Our ability to release this was delayed due to the unforeseen security release v4.3. We are quickly bumping to v4.4 in order to bring you the features that we had committed to releasing. This release also includes over 70 fixes to our docs and over 50 fixes to tests. The SEMVER-MINOR changes include: * deps: - An update to v8 that introduces a new flag --perf_basic_prof_only_functions (Ali Ijaz Sheikh) #3609 * http: - A new feature in http(s) agent that catches errors on *keep alived* connections (José F. Romaniello) #4482 * src: - Better support for Big-Endian systems (Bryon Leung) #3410 * tls: - A new feature that allows you to pass common SSL options to `tls.createSecurePair` (Коренберг Марк) #2441 * tools - a new flag `--prof-process` which will execute the tick processor on the provided isolate files (Matt Loring) #4021 Notable semver patch changes include: * buld: - Support python path that includes spaces. This should be of particular interest to our Windows users who may have python living in `c:/Program Files` (Felix Becker) #4841 * https: - A potential fix for #3692 HTTP/HTTPS client requests throwing EPROTO (Fedor Indutny) #4982 * installer: - More readable profiling information from isolate tick logs (Matt Loring) #3032 * *npm: - upgrade to npm 2.14.20 (Kat Marchán) #5510 * process: - Add support for symbols in event emitters. Symbols didn't exist when it was written ¯\_(ツ)_/¯ (cjihrig) #4798 * querystring: - querystring.parse() is now 13-22% faster! (Brian White) #4675 * streams: - performance improvements for moving small buffers that shows a 5% throughput gain. IoT projects have been seen to be as much as 10% faster with this change! (Matteo Collina) #4354 * tools: - eslint has been updated to version 2.1.0 (Rich Trott) #5214 PR-URL: #5301
for (size_t i = 0; i < nchars; i++) | ||
dst[i] = dst[i] << 8 | dst[i] >> 8; | ||
break; | ||
SwapBytes(dst, dst, nchars); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was pointed out in #7645 that this change introduces a bug: the break
statement should have been retained because now the bytes are swapped back by the fallback path below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out! Does anyone already develop a fix? if not, I can do the favour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#7645 should fix this when it lands but I've made a note in case that PR stalls.
Between nodejs#3410 and nodejs#7645, bytes were swapped twice on bigendian platforms if buffer was not two-byte aligned. See comment in nodejs#7645.
Removes use of builtins that are unavailable for older clang. Per benchmarks, only uses builtins on Windows, where speedup is significant. Also adds test for unaligned ucs2 buffer write. Between #3410 and #7645, bytes were swapped twice on bigendian platforms if buffer was not two-byte aligned. See comment in #7645. PR-URL: #7645 Fixes: #7618 Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: James M Snell <[email protected]>
Removes use of builtins that are unavailable for older clang. Per benchmarks, only uses builtins on Windows, where speedup is significant. Also adds test for unaligned ucs2 buffer write. Between #3410 and #7645, bytes were swapped twice on bigendian platforms if buffer was not two-byte aligned. See comment in #7645. PR-URL: #7645 Fixes: #7618 Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: James M Snell <[email protected]>
Removes use of builtins that are unavailable for older clang. Per benchmarks, only uses builtins on Windows, where speedup is significant. Also adds test for unaligned ucs2 buffer write. Between #3410 and #7645, bytes were swapped twice on bigendian platforms if buffer was not two-byte aligned. See comment in #7645. PR-URL: #7645 Fixes: #7618 Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Ben Noordhuis <[email protected]> Reviewed-By: James M Snell <[email protected]> Conflicts: src/node_buffer.cc
Added for loop to Encode function to support big-endian machines.
Used stack allocation instead of heap allocation to improve speed of endianness correction.
Resolves #3409