Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module: fix detect-module not retrying as ESM for code that errors only in CommonJS #52024

Merged

Conversation

GeoffreyBooth
Copy link
Member

Fixes #50917, using the solution described in #50917 (comment).

The general algorithm for --experimental-detect-module is to try to parse as CommonJS, and if a syntax error is thrown that corresponds to ESM syntax (import or export, or import.meta), try again as ESM. The edge case pointed out by #50917 is when there’s syntax that throws in CommonJS but parses in ESM, and this syntax is above the first ESM syntax (so top-level await, or a declaration of a variable with the same name as one of the CommonJS module wrapper variables such as const require =, on the first line or any line above the first import or export).

The tricky thing is that the errors thrown by top-level await or const require in CommonJS are the same errors as thrown by typing await in any ordinary sync function, or by declaring require twice in user code (e.g. const require = 1; const require = 2). So we only want to retry parsing as ESM when these errors are because the await or problematic variable declaration is at the top level, where it throws in CommonJS but parses in ESM.

To determine this, this PR creates a new error path where if the CommonJS parse returns a syntax error corresponding to either of these cases, we do a special second CommonJS parse of the code wrapped in an async function. This wrapper function creates a new scope, so const require is no longer a problematic redeclaration; and await no longer throws because it’s within an async function. This wrapper only affects the top level, so await in a sync function farther down or variable redeclarations farther down (or two const declarations at the top level) will still throw; but the code permitted in ESM parses successfully. If this second parse doesn’t throw any errors, we resume the detection algorithm and try parsing as ESM.

@nodejs/loaders @chjj @meyfa @joyeecheung

@GeoffreyBooth GeoffreyBooth added module Issues and PRs related to the module subsystem. esm Issues and PRs related to the ECMAScript Modules implementation. labels Mar 9, 2024
@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. vm Issues and PRs related to the vm subsystem. labels Mar 9, 2024
@nodejs-github-bot
Copy link
Collaborator

@GeoffreyBooth GeoffreyBooth force-pushed the fix-detect-module-cjs-exceptions branch 2 times, most recently from 3b2f28e to 680694b Compare March 10, 2024 02:18
@nodejs-github-bot
Copy link
Collaborator

@chjj
Copy link
Contributor

chjj commented Mar 10, 2024

I might not be reading the code correctly, but does this cover the case of:

await Promise.resolve();
import 'x';

It looks like the C++ code would try to handle the "await" syntax error first, wrap the code in an async function and recompile. The recompilation would then throw an "import" syntax error, ultimately causing should_retry_as_esm to remain false.

src/node_contextify.cc Outdated Show resolved Hide resolved
src/node_contextify.cc Outdated Show resolved Hide resolved
@GeoffreyBooth
Copy link
Member Author

I might not be reading the code correctly, but does this cover the case of:

Great catch @chjj, I’ve updated the PR to address that case. Any others that you can think of that I might have missed?

Copy link
Contributor

@guybedford guybedford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good! const module = 'x' as defining an ES module as an extension of the ESM syntax checks seems like the right kind of compromise to be making in the design space to me. I do hope we could optimize TLA in due course.

Note that when sloppy mode errors precede an identifier redeclaration there will still be error consistency, but as discussed separately there are ways to handle this by pointing to the correct error for the ESM parse, and this can be done as follow-up work if needed.

I would be happy to mark for approval as soon as we've got an updated note in the ESM docs describing the new format detection rules, as I think it's important to track the rules and their changes carefully.

src/node_contextify.cc Outdated Show resolved Hide resolved
@GeoffreyBooth
Copy link
Member Author

I would be happy to mark for approval as soon as we've got an updated note in the ESM docs describing the new format detection rules

Added.

doc/api/cli.md Outdated Show resolved Hide resolved
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@GeoffreyBooth GeoffreyBooth added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Mar 11, 2024
src/node_contextify.cc Outdated Show resolved Hide resolved
@GeoffreyBooth GeoffreyBooth force-pushed the fix-detect-module-cjs-exceptions branch from 93c2b94 to 3fbeb43 Compare March 11, 2024 23:07
@GeoffreyBooth GeoffreyBooth force-pushed the fix-detect-module-cjs-exceptions branch from 3fbeb43 to 228f5c7 Compare March 11, 2024 23:10
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@GeoffreyBooth GeoffreyBooth added commit-queue Add this label to land a pull request using GitHub Actions. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. labels Mar 12, 2024
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Mar 12, 2024
@nodejs-github-bot nodejs-github-bot merged commit 63d04d4 into nodejs:main Mar 12, 2024
55 checks passed
@nodejs-github-bot
Copy link
Collaborator

Landed in 63d04d4

@GeoffreyBooth GeoffreyBooth deleted the fix-detect-module-cjs-exceptions branch March 12, 2024 14:50
rdw-msft pushed a commit to rdw-msft/node that referenced this pull request Mar 26, 2024
marco-ippolito pushed a commit that referenced this pull request May 2, 2024
marco-ippolito pushed a commit that referenced this pull request May 3, 2024
jcbhmr pushed a commit to jcbhmr/node that referenced this pull request May 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. c++ Issues and PRs that require attention from people who are familiar with C++. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. esm Issues and PRs related to the ECMAScript Modules implementation. module Issues and PRs related to the module subsystem. needs-ci PRs that need a full CI run. vm Issues and PRs related to the vm subsystem.
Projects
None yet
6 participants