Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: add spec for contains module syntax #52059

Merged

Conversation

GeoffreyBooth
Copy link
Member

This PR updates the ESM resolution spec to correct the language about how we do the “contains module syntax” check.

@nodejs/loaders @guybedford

@GeoffreyBooth GeoffreyBooth added doc Issues and PRs related to the documentations. module Issues and PRs related to the module subsystem. esm Issues and PRs related to the ECMAScript Modules implementation. labels Mar 12, 2024
@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/loaders

@GeoffreyBooth GeoffreyBooth added the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Mar 13, 2024
doc/api/esm.md Outdated
Comment on lines 1133 to 1139
> * `import` statement (static only, _not_ dynamic `import()`)
> * `export` statement
> * `import.meta`
> * `await` at the top level
> * A lexical redeclaration of any of the CommonJS wrapper variables
> (`require`, `exports`, `module`, `__filename`, `__dirname`) at the top
> level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
> * `import` statement (static only, _not_ dynamic `import()`)
> * `export` statement
> * `import.meta`
> * `await` at the top level
> * A lexical redeclaration of any of the CommonJS wrapper variables
> (`require`, `exports`, `module`, `__filename`, `__dirname`) at the top
> level
> * A static `import` statement.
> * An `export` statement.
> * Usage of the `import.meta` meta property.
> * An `await` expression at the top level.
> * A lexical redeclaration of any of the CommonJS wrapper variables
> (`require`, `exports`, `module`, `__filename`, `__dirname`) at the top
> level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's important to clarify that this is only for import statements and not expressions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that [[HasTLA]] is already an ECMAScript module field - https://tc39.es/ecma262/#sec-cyclic-module-records.

@guybedford
Copy link
Contributor

guybedford commented Mar 14, 2024

I worry about tying this detection design to the exact implementation, as opposed to defining what it is checking. Do you think it would be equivalent to claim the detection is the following:

  1. If the source contains await at the top-level, or static import or export syntax, or contains import.meta, return true.
  2. If the source contains a top-level lexical redeclaration (const, let, or class) of any of the CommonJS wrapper variables require, exports, module, __filename, __dirname, then return true.
  3. Return false.

What I prefer about this definition is that we are fully tying it to the exact check rules, as opposed to tying it to all the minutia of JS parser error behaviour.

@GeoffreyBooth
Copy link
Member Author

Do you think it would be equivalent to claim the detection is the following:

Yes, I think that would be equivalent. What steps would you replace with that text?

I also think it should say “static import or export statements (not dynamic import() expressions)” because I think others who might look at this spec to reimplement it, say in bundlers, might not catch the nuance that they should exclude import() unless we spell it out.

@GeoffreyBooth GeoffreyBooth removed the author ready PRs that have at least one approval, no pending requests for changes, and a CI started. label Mar 14, 2024
@aduh95
Copy link
Contributor

aduh95 commented Mar 14, 2024

others who might look at this spec to reimplement it, say in bundlers, might not catch the nuance that they should exclude import() unless we spell it out

I think implementers of all people are the least likely to miss that nuance.

@guybedford
Copy link
Contributor

Yes, I think that would be equivalent. What steps would you replace with that text?

If that is the case, I'd suggest we replace the existing steps with the steps in #52059 (comment) to remove dependence on ECMA-262 parsing error behaviours (which may also change over time).

@guybedford
Copy link
Contributor

So basically, to flip it around to not depend on a CJS parse in future here, and instead ensure the ESM path is the performance optimized one:

> Parse _source_ as an ECMAScript module.
> If the parse is successful,
  > If _source_ contains top-level await, static `import` or `export` syntax or `import.meta`, return *true*.
  > If the source contains a top-level lexical redeclaration (`const`, `let`, or `class`) of any of the CommonJS wrapper variables `require`, `exports`, `module`, `__filename`, `__dirname`, then return *true*.
> Return *false*.

Then we can effectively leave "error reporting" itself as undefined for tooling and bundlers, since every environment can handle their own errors, so we don't need to strictly define that. By allowing that to be a grey area, this should allow us to switch between CJS-first and ESM-first parsing detections I believe? But I may also have missed something so let me know your thoughts.

doc/api/esm.md Outdated
@@ -1124,6 +1124,21 @@ _isImports_, _conditions_)
> 1. Throw an _Invalid Package Configuration_ error.
> 4. Return the parsed JSON source of the file at _pjsonURL_.

**CONTAINS\_MODULE\_SYNTAX**(_source_)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps DETECT_MODULE_SYNTAX to capture that this is the "detection path".

@GeoffreyBooth
Copy link
Member Author

flip it around to not depend on a CJS parse

Well our implementation has to parse as cjs first in order to avoid performance regression. If we define the spec as parsing as esm first then we'll be out of compliance, because ambiguous code that can parse as either would then need to run in strict mode per the spec.

@guybedford
Copy link
Contributor

If we define the spec as parsing as esm first then we'll be out of compliance, because ambiguous code that can parse as either would then need to run in strict mode per the spec.

The compliance says nothing about error handling - only when to treat it as ESM or CJS. Ambiguous code always executes as CJS, regardless of parsing. So where is the non-compliance here?

@GeoffreyBooth
Copy link
Member Author

Ambiguous code always executes as CJS

I guess you’re right; I was thinking of code like console.log(this) but I guess in your version it still runs as CJS. So I guess your version provides the same outcome as our implementation, though it’s the inverse of the actual implementation. What’s the benefit of spec’ing it in this way as opposed to how we’ve actually implemented it?

@guybedford
Copy link
Contributor

What’s the benefit of spec’ing it in this way as opposed to how we’ve actually implemented it?

We instead frame things towards ESM being the optimized parse case, where CJS is an accommodation for the current parsing approach, so that we can optimize ESM as the primary parse in future. We could even document that the reason for the CJS variable handling is to allow for the CJS parsing preference today but that it is designed to work both ways?

@GeoffreyBooth GeoffreyBooth force-pushed the contains-module-syntax-spec branch from f969dfb to 931460f Compare March 17, 2024 22:19
@GeoffreyBooth
Copy link
Member Author

@guybedford I implemented your suggestion, please let me know what you think.

@GeoffreyBooth GeoffreyBooth added commit-queue Add this label to land a pull request using GitHub Actions. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. labels Mar 20, 2024
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Mar 20, 2024
@nodejs-github-bot nodejs-github-bot merged commit f1949ac into nodejs:main Mar 20, 2024
18 checks passed
@nodejs-github-bot
Copy link
Collaborator

Landed in f1949ac

rdw-msft pushed a commit to rdw-msft/node that referenced this pull request Mar 26, 2024
PR-URL: nodejs#52059
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Guy Bedford <[email protected]>
marco-ippolito pushed a commit that referenced this pull request May 2, 2024
PR-URL: #52059
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Guy Bedford <[email protected]>
marco-ippolito pushed a commit that referenced this pull request May 3, 2024
PR-URL: #52059
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Guy Bedford <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. doc Issues and PRs related to the documentations. esm Issues and PRs related to the ECMAScript Modules implementation. module Issues and PRs related to the module subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants