This repository has been archived by the owner on Sep 2, 2023. It is now read-only.

MIME types #146

Closed
demurgos opened this issue Jul 9, 2018 · 13 comments

Comments

@demurgos

demurgos commented Jul 9, 2018

Hi,
I want to open an issue dedicated to MIME types. MIME types are mentioned fairly often in issue #142, but their relevance is not entirely clear to me. Since the other issue is already extremely long, I prefer to open a separate one.

I would like a clarification about the relation between MIME types, ESM and Node.

Here is what I currently understand, please correct anything wrong and provide more details to clarify the role of MIME types.

  • The authoritative source for MIME types for scripting files is RFC4329
  • There is a pending update for this RFC: ECMAScript Media Types Updates
  • The default MIME type for .js is text/javascript
  • A file loaded by <script src=...> or <script type="module" src=...> can be served with any MIME type, browsers don't care: <script src=...> will use the Script goal, <script type="module" src=...> will use the Module goal.
  • Browsers will also ignore the MIME type of transitive dependencies of ESM (static or dynamic import) and treat these dependencies as ESM (even if they are obviously not, for example application/wasm)
  • Node internally uses the MIME type application/node for the script file (for which extensions: .js, .mjs, .node?!). Where is it used? Why? Are servers supposed to serve files with this MIME type?
  • The IETF proposal recommends the same MIME type (text/javascript) for both .js and .mjs.
  • The IETF proposal recommends the use of the optional goal parameter for the parse goal. So files with the Module goal can be served using text/javascript;goal=Module, files with the script goal can be served with the text/javascript;goal=Script MIME type.
  • The IETF proposal recommends .mjs to mean text/javascript;goal=Module
  • The browsers don't care about the goal MIME parameter currently.
  • There is a PR so that, if the goal parameter is present, browsers must error when the goal used by the browser does not match the goal declared by the server.
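
The goal parameter and the proposed mismatch check from the last bullets can be sketched roughly as follows. This is only an illustration under stated assumptions: `parseMediaType` and `checkGoal` are hypothetical helper names, the parsing is deliberately simplified, and comparing the goal value case-insensitively is an assumption rather than something the draft is known to require.

```javascript
// Hypothetical helper: split a Content-Type value into its essence
// ("text/javascript") and its parameters ({ goal: "Module" }).
// Simplified parsing, not a full media type parser.
function parseMediaType(header) {
  const [essence, ...rest] = header.split(';');
  const params = {};
  for (const part of rest) {
    const idx = part.indexOf('=');
    if (idx === -1) continue; // skip malformed parameters
    const name = part.slice(0, idx).trim().toLowerCase();
    const value = part.slice(idx + 1).trim().replace(/^"|"$/g, '');
    params[name] = value;
  }
  return { essence: essence.trim().toLowerCase(), params };
}

// Sketch of the proposed check: only complain when a goal parameter
// is present and disagrees with the goal the loader is about to use.
// Case-insensitive comparison is an assumption made for this sketch.
function checkGoal(header, goalUsedByLoader) {
  const { params } = parseMediaType(header);
  if (params.goal === undefined) return true; // nothing to check
  return params.goal.toLowerCase() === goalUsedByLoader.toLowerCase();
}
```

With this sketch, `checkGoal('text/javascript;goal=Script', 'Module')` reports a mismatch, while a plain `text/javascript` passes for either goal, which matches the "browsers don't care currently" behaviour when the parameter is absent.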
@bmeck
Member

bmeck commented Jul 9, 2018

I generally use MIME to disambiguate the format of a given blob that is to be processed. It corresponds to some sort of grammar/evaluation structure for that format. It is generally obtained through file extension or the HTTP content-type: header.

The default MIME type for .js is text/javascript

Yes and no, it is a MIME type that is registered for .js, but there are several others including application/node (CJS).

A file loaded by <script src=...> or <script type="module" src=...> can be served with any MIME type, browsers don't care: <script src=...> will use the Script goal, <script type="module" src=...> will use the Module goal.

For <script src=...> you can use any MIME and the browser will treat it as a Script. Even if the MIME is application/wasm or application/node.

However, for <script type="module" src=...> this is not true: the browser uses the MIME to pick whichever format is defined for type="module". So if you serve it as application/wasm, for example, it would load the WASM as an ESM; if you serve it with application/octet-stream, for which there is no browser-defined integration, it would error while fetching/linking the graph.

Browser will also ignore the MIME type of transitive dependencies of ESM (static or dynamic import) and treat these dependencies as ESM (even if they are obviously not: example application/wasm)

No, it will check the MIME of every dependency and treat them differently according to that.
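
The two loading paths described above can be sketched as a small dispatch function. This is a hedged illustration of the behaviour as described in this comment, not browser code: `MODULE_FORMATS` and `resolveFormat` are hypothetical names, and the application/wasm entry reflects the example given here rather than anything known to be shipped.

```javascript
// Hypothetical table of formats a type="module" loader knows how to integrate.
const MODULE_FORMATS = new Map([
  ['text/javascript', 'esm'],
  ['application/javascript', 'esm'],
  ['application/wasm', 'wasm'], // per the example in this thread
]);

// loadKind is 'classic' for <script src=...> and 'module' for
// <script type="module" src=...> and all of its transitive imports.
function resolveFormat(loadKind, mime) {
  if (loadKind === 'classic') {
    return 'script'; // the MIME is ignored, everything is treated as a Script
  }
  const format = MODULE_FORMATS.get(mime.toLowerCase());
  if (format === undefined) {
    // e.g. application/octet-stream or application/node: no integration
    throw new TypeError(`No module integration for ${mime}`);
  }
  return format;
}
```

Under these assumptions, `resolveFormat('classic', 'application/wasm')` still yields `'script'`, while `resolveFormat('module', 'application/octet-stream')` throws.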

Node uses internally the MIME type application/node for the script file (for which extensions: .js, .mjs, .node?!). Where is it used? Why?

It doesn't use that string, but I am using the word "MIME" as a means of communicating what format Node is dealing with. In particular, application/node instead of .js or CJS because the term "CJS" is often applied to things outside of the format of a given blob being loaded and .js has multiple formats it could contain.

This application/node MIME applies to the CJS format only per its registration. .node was rejected from official MIME registration because it had ambiguous contents (a thing that I find kind of hilarious personally). We would want to use a vendored MIME if we load files with .node since they are not CJS formatted blobs, but are a C++ module using various shared library file formats.

Per the "why", it is just meaning that some blob being loaded is CJS in format and not something else like JSON.

Are servers supposed to serve files with this MIME type?

They absolutely could, and it would let browsers know that the file is not ESM, so they would not execute it. It also lets things like loaders/service workers/tools/etc. know the format of the file and process it in whatever fashion they need to (such as compiling to a different format).

The IETF proposal recommends the use of the optional goal parameter for the parse goal. So files with the Module goal can be served using text/javascript;goal=Module, files with the script goal can be served with the text/javascript;goal=Script MIME type.
The IETF proposal recommends .mjs to mean text/javascript;goal=Module
The browsers don't care about the goal MIME parameter currently.
There is a PR so if the goal parameter is present, the browsers must error if the goal used by the browser does not match the goal served by the server.

Yes, but the HTML specification has shown little interest in adding checks so far. This addition would provide a real MIME for the Script goal of JS and a means of ensuring something is loaded in the intended format, which the HTML specification currently lacks. It also wouldn't solve the problem of <script src=...> still loading non-JS MIMEs as a Script, so it is a limited solution.

@demurgos
Author

demurgos commented Jul 9, 2018

Thank you for your answer, it clarified a few things for me.

The default MIME type for .js is text/javascript

Yes and no, it is a MIME type that is registered for .js, but there are several others including application/node (CJS).

Is it right to go further and say that in absence of any other information (context, parsing), text/javascript is the best fallback MIME type for .js files? (some sort of application/octet-stream for JS)


Thanks for the explanation about <script>, I felt that it was not as simple as what I had written but it was more a lack of knowledge from my side than an ambiguity.

To go further: type="module" means that the browser uses the MIME type to interpret the file. This is also applied to transitive dependencies. application/wasm means "WASM module", text/javascript means "JavaScript with the Module goal". This means there's no way today in web browsers to import a JS file with the Script goal from a JS file with the Module goal, because they share the same text/javascript MIME type. Is that right? (At best, you could get a failure using goal=Script if the goal check proposal is implemented.)


It doesn't use that string, but I am using the word "MIME" as a means of communicating what format Node is dealing with. In particular, application/node instead of .js or CJS because the term "CJS" is often applied to things outside of the format of a given blob being loaded and .js has multiple formats it could contain.

Thanks, this is the main piece I was missing. While digging into MIME types this morning I found this issue about registering MIME types for Node and the IANA-approved application/node assignment.
I missed it at first, but it's written there that it's about CommonJS.
A blob with the application/node MIME type contains JS source text expected to be parsed with the Script goal and executed in an environment providing a Node-like CommonJS module system. The Script goal is not mentioned, but it's implied by "Node.js CommonJS execution environments", right?

@bmeck
Member

bmeck commented Jul 9, 2018

Is it right to go further and say that in absence of any other information (context, parsing), text/javascript is the best fallback MIME type for .js files? (some sort of application/octet-stream for JS)

Probably can't make any assumption here. Both application/node and text/javascript have significant meanings in different contexts. Neither is "better" since it is just about what the more common one is in a given context. For Node, .js has always meant application/node so it probably would be the more appropriate default. For browsers, they cannot load application/node so it doesn't make sense to serve files with that format to them. That is the problem: we have multiple potential contexts we are talking about shipping these source blobs to, and multiple MIMEs for the different contexts, with certain backwards compatibility concerns on both sides.

So... what I would recommend depends on the environment.

This means there's no way today in web browser to import a JS file with the Script goal from a JS file with the Module goal because they share the same text/javascript MIME type. Is it right? (At best, you could get a failure using goal=Script if the goal check proposal is implemented.)

To a browser, Scripts have no associated MIME so it doesn't matter if the web server sends the same MIME as anything else. Effectively, I can't find any browser or web server standard that uses MIMEs and loads using Script.

A blob with the application/node MIME type contains JS source text expected to be parsed with the Script goal and executed in an environment providing a Node-like CommonJS module system. The Script goal is not mentioned, but it's implied by "Node.js CommonJS execution environments", right?

CJS (application/node) uses the function body grammar of JS, evaluated in sloppy mode, rather than the Script goal itself; that is why things like return can appear in the top scope. It has a few other differences that are being fixed up in ECMA-262, like supporting hashbangs and adding a local variable contour for the various "magic" variables that does not come off the global scope.

@GeoffreyBooth
Member

To add a bit to what @bmeck wrote, it’s the webservers that choose what MIME type to serve a particular file as, and webservers are free to decide that however they want. Every webserver I’m familiar with chooses MIME types based on file extensions, and they have mappings of what MIME type to choose for each extension, e.g. .js maps to text/javascript. These mappings are usually configurable, though obviously not every user will have sufficient access to change these settings; I can’t configure what MIME types my CDN chooses, for example.

Node choosing how to parse a file, for example deciding if a .js file should be treated as CommonJS or as ESM, is equivalent to the webserver deciding what MIME type to serve for the file: application/node (CommonJS) or text/javascript (ESM). Because no browser supports CommonJS, though, pretty much no webserver will serve .js as application/node, even if sometimes it should (because it might not be executable in a browser context). So webservers generally serve all .js files as text/javascript or application/javascript, which are interchangeable.

The complication for --experimental-modules is that webservers also serve .mjs as text/javascript. So both .js and .mjs are served with the same MIME type on the Web, and therefore webservers aren’t disambiguating between Script (traditional) JavaScript and Module (ESM) JavaScript—you can save your ESM JavaScript file with either a .js or an .mjs extension, and either will work just fine on the Web. Whereas --experimental-modules doesn’t treat these identically—it loads .js as if it were a webserver serving it as application/node, a.k.a. CommonJS, so if your .js file contains import or export statements that file will throw an error.
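
To make the mismatch concrete, here is a rough sketch of the two mappings as described in this comment. The table names and helper functions are hypothetical, the MIME strings on the Node side are only labels for the format (Node does not literally use them), and real webserver tables are configurable and much larger.

```javascript
// Typical webserver view per this thread: both extensions share one MIME type.
const WEB_MIME_BY_EXTENSION = {
  '.js': 'text/javascript',
  '.mjs': 'text/javascript',
};

// --experimental-modules view per this thread, written as MIME-like labels.
const NODE_FORMAT_BY_EXTENSION = {
  '.js': 'application/node', // CommonJS
  '.mjs': 'text/javascript', // ESM
};

// Extract the extension of the last path segment ('' if there is none).
function extensionOf(path) {
  const base = path.slice(path.lastIndexOf('/') + 1);
  const dot = base.lastIndexOf('.');
  return dot <= 0 ? '' : base.slice(dot);
}

// True when the web and Node would interpret the same file differently.
function interpretationsDiffer(path) {
  const ext = extensionOf(path);
  return WEB_MIME_BY_EXTENSION[ext] !== NODE_FORMAT_BY_EXTENSION[ext];
}
```

With these tables, `interpretationsDiffer('lib/index.js')` is true and `interpretationsDiffer('lib/index.mjs')` is false, which is the ambiguity around .js being discussed.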

@ljharb
Member

ljharb commented Jul 9, 2018

The webserver also often needs to know, programmatically (as do many other tools), what it is, so that it can add type="module" or not, or lint it, or decide which AST to parse it to, etc.

@bmeck
Member

bmeck commented Jul 9, 2018

@GeoffreyBooth you can set up custom content-type headers in the various CDNs and storage services across a variety of platforms.

Most CDNs also allow uploads directly or through proxying to websites. If they proxy to websites they will use the content-type from the HTTP requests that they send out. So we would mostly be concerned with situations where they are not configurable, which is primarily when uploading files through things like .zip files, FTP, etc.

I do think there is a point of pain here, but it isn't major if Node provides some way to load .js files as text/javascript, and we already face a similar problem today when people upload application/node files as text/javascript. We haven't seen significant backlash about people using .js for something the browser is unable to load, i.e. putting application/node files out on servers with the wrong content-type, so I do wonder if all of this discussion about the "default" MIME needing to be the same is already disproven.

Node choosing how to parse a file, for example deciding if a .js file should be treated as CommonJS or as ESM, is equivalent to the webserver deciding what MIME type to serve for the file: application/node (CommonJS) or text/javascript (ESM). Because no browser supports CommonJS, though, pretty much no webserver will serve .js as application/node, even if sometimes it should (because it might not be executable in a browser context). So webservers generally serve all .js files as text/javascript or application/javascript, which are interchangeable.

However, application/node and text/javascript are not interchangeable. Misidentification of format is exactly what you are claiming we should avoid. Right now we have no disambiguation mechanism that works across platforms, so we should solve that problem instead of endorsing misidentification of format as a reason to throw out static knowledge of a format. I don't see this statement as a point in the argument either for or against matching the web, since it seems self-contradictory to me.

The complication for --experimental-modules is that webservers also serve .mjs as text/javascript. So both .js and .mjs are served with the same MIME type on the Web, and therefore webservers aren’t disambiguating between Script (traditional) JavaScript and Module (ESM) JavaScript—you can save your ESM JavaScript file with either a .js or an .mjs extension, and either will work just fine on the Web. Whereas --experimental-modules doesn’t treat these identically—it loads .js as if it were a webserver serving it as application/node, a.k.a. CommonJS, so if your .js file contains import or export statements that file will throw an error.

Yes, because we don't have another disambiguation mechanism. That is why things like nodejs/node#18392 are attractive and should be looked into. Your statement is not an argument for wanting ambiguity, or for making the format of a given file unknowable ahead of time; it is just a statement that we need to keep improving how well we satisfy our use cases, by exposing disambiguation in such a way that people could use .js as text/javascript.

@demurgos
Author

demurgos commented Jul 11, 2018

I am closing this issue because I feel that the MIME types were clarified. You can still comment if you want to.

@Jessidhia

I noticed that the IETF draft is still being updated (e.g. https://datatracker.ietf.org/doc/draft-ietf-dispatch-javascript-mjs/ was updated just last week), but the draft makes no mention of the .cjs extension that has been used by --experimental-modules for over a year now. 🤔

@bmeck
Member

bmeck commented Nov 6, 2019

@Jessidhia .cjs is CommonJS and would be under application/node not under text/javascript so it wouldn't be in that draft. An update could be done for application/node if desired but would be separate from that draft.

@jkrems
Contributor

jkrems commented Nov 6, 2019

I think we were not too concerned about a standards registration for .cjs since CommonJS is de facto a Node-internal format, and I wouldn't expect things like web servers to actually care about sending the proper MIME type.

@SMotaal

SMotaal commented Nov 10, 2019

@jkrems I don't disagree in a sense of urgency, but I personally want to urge taking a less cut-and-dry view on such matters.

When in Rome, so to speak… JavaScript, MIME, those things come with their own concerns for our consideration, and so we are losing sight of those considerations when we resolve to making concrete what can be fairly just a preference-to-not-bother-yet?!

The thing about MIME is that it needs to be distinct today, and tomorrow… and we cannot claim we know how things will look (or how they do already) and so that would be with due respect to the rationale that it is just internal as of yet.

It is fair to keep in mind that application/node is hardly as uniquely specialized and/or branded as application/postscript or text/javascript, and with it being that watery, we should be wary of pitfalls from conflicts when applying it.

The thing about JavaScript is that it goes places, and one of those is renderers that do consume either format. So far that worked without MIME, but no one used MIME back then anyway… this is no longer going to be the case; I mean, if we needed them, they probably will too.

All around, not the right time today, but not cut and dry… fair?

@GeoffreyBooth
Member

Just to be clear, I think the only update that would potentially need to be done is to update application/node so that under “File extension(s): .js” we add .cjs. I think I asked about this back when .cjs was first proposed and the response was that we wanted to wait until unflagging, if I remember correctly. So maybe in a week or so we can submit the revision to make that happen.

As far as I'm aware there’s no official registration for file extensions: it goes MIME type ➡️ one or more extensions, but not the other way around.

@GeoffreyBooth
Member

Update: https://www.iana.org/assignments/media-types/application/node now includes .cjs.
