Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module: use Wasm CJS lexer when available #35583

Closed
wants to merge 3 commits into from

Conversation

guybedford
Copy link
Contributor

@guybedford guybedford commented Oct 10, 2020

This uses the Web Assembly version of the cjs-module-lexer whenever Web Assembly is available, avoiding the v8 compiler cold start that applies to the JS version, causing slowdowns for parsing multi-MB JS files cold as reported in #35574.

This will gracefully then fallback to the JS implementation when eg using the --jitless flag in Node.js.

This is the same version of the cjs-module-lexer codebase, and is only a performance patch.

I've also included a benchmark for the parser as well for any future work here (eg linking the native C in directly at some point again).

I've also added a graceful fallback if there are any Wasm execution issues.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/modules

@nodejs-github-bot nodejs-github-bot added build Issues and PRs related to build files or the CI. esm Issues and PRs related to the ECMAScript Modules implementation. labels Oct 10, 2020
@Trott Trott added the request-ci Add this label to start a Jenkins CI on a PR. label Oct 10, 2020
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Oct 10, 2020
@nodejs-github-bot
Copy link
Collaborator

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@guybedford
Copy link
Contributor Author

Any ideas what the failing OS platforms have in common here? Could it be endian issues?

https://ci.nodejs.org/job/node-test-pull-request/33515/

@guybedford
Copy link
Contributor Author

I'm not sure we have any Wasm tests for Node.js, so it could also be general Wasm coverage issues even?

@guybedford
Copy link
Contributor Author

I've added a simple check to gracefully fallback to the JS implementation if there are any issues running Wasm.

@richardlau
Copy link
Member

Any ideas what the failing OS platforms have in common here? Could it be endian issues?

https://ci.nodejs.org/job/node-test-pull-request/33515/

AIX and LinuxONE are big endian so quite possibly.
cc @miladfarca @nodejs/platform-aix @nodejs/platform-s390

@nodejs-github-bot
Copy link
Collaborator

@guybedford
Copy link
Contributor Author

Looking at the failures, the Mac and Windows failures are both due to test flakes.

The genuine failures of the Wasm execution only seem to be linuxone and aix AFAICT.

Endianness shouldn't actually be an issue because Wasm is always little endian, so that's why I'm wondering if these are upstream Wasm issues. Not sure how to debug without access to one of these machines unfortunately.

Just rerunning CI with the graceful fallback. As a perf-only change, it seems fine to skip Wasm for these machines.

@richardlau
Copy link
Member

How was the WebAssembly generated? FWIW There is a known endian issue with emscripten (emscripten-core/emscripten#12387).

@guybedford
Copy link
Contributor Author

The Wasm is generated with LLVM, but as I say the Wasm virtual machine itself is always designed to be little endian even on big endian hardware. Hence why I'm wondering if these are more upstream issues.

@guybedford
Copy link
Contributor Author

CI is passing now though, so we are good to landing pending further approvals.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

guybedford added a commit that referenced this pull request Oct 12, 2020
PR-URL: #35583
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
@guybedford
Copy link
Contributor Author

Landed in f3ce281.

@miladfarca
Copy link
Contributor

Yes it seems to be due to endianness. wasm is LE enforced, however typed arrays in Javascript depend on the host endianness and are not portable, looking at the source of lexer.js:
https://github.com/guybedford/cjs-module-lexer/blob/master/src/lexer.js#L36
Uint16Array is being used which would behave differently on LE and BE machines. Could you try using a DataView instead and force LE ordering? It would look like this:

...
while (i < len){
   new DataView(outBuf16.buffer).setUint16(outBuf16.byteOffset + (2*i), src.charCodeAt(i++), true);
}

@guybedford
Copy link
Contributor Author

@miladfarca thank you so much for finding this, that does sound like it then! I've created a PR with big endian support at nodejs/cjs-module-lexer#13. If anyone has the hardware to test this on with an npm run build && npm run test that would be a huge help in verifying the fix.

MylesBorins pushed a commit that referenced this pull request Oct 14, 2020
PR-URL: #35583
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
@MylesBorins MylesBorins mentioned this pull request Oct 14, 2020
MylesBorins pushed a commit that referenced this pull request Nov 3, 2020
PR-URL: #35583
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
@MylesBorins MylesBorins mentioned this pull request Nov 3, 2020
MylesBorins pushed a commit that referenced this pull request Nov 16, 2020
PR-URL: #35583
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
joesepi pushed a commit to joesepi/node that referenced this pull request Jan 8, 2021
PR-URL: nodejs#35583
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: Geoffrey Booth <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Issues and PRs related to build files or the CI. esm Issues and PRs related to the ECMAScript Modules implementation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants