Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Windows) FS can not handle certain characters in file name #48673

Closed
ZEDCWT opened this issue Jul 6, 2023 · 22 comments · Fixed by libuv/libuv#4092
Closed

(Windows) FS can not handle certain characters in file name #48673

ZEDCWT opened this issue Jul 6, 2023 · 22 comments · Fixed by libuv/libuv#4092
Labels
fs Issues and PRs related to the fs subsystem / file system. libuv Issues and PRs related to the libuv dependency or the uv binding. windows Issues and PRs related to the Windows platform.

Comments

@ZEDCWT
Copy link

ZEDCWT commented Jul 6, 2023

Version

v20.4.0

Platform

Microsoft Windows NT 10.0.19045.0 x64

Subsystem

No response

What steps will reproduce the bug?

Run the following script

require('fs').writeFileSync('\uD83D\uDD79\uFE0F.log','Test')

Where \uD83D\uDD79\uFE0F is 🕹️

How often does it reproduce? Is there a required condition?

Always failed for v20.4.0, v20.3.1 is okay

What is the expected behavior? Why is that the expected behavior?

Run without errors

What do you see instead?

~

Additional information

Tried on linux, no errors

@atlowChemi atlowChemi added the fs Issues and PRs related to the fs subsystem / file system. label Jul 6, 2023
@bnoordhuis
Copy link
Member

Libuv recently switched to WTF-8, to deal with surrogate pairs (0xD83D 0xDD79 is a surrogate pair) in Windows file paths. That change landed in v20.4.0.

cc @vtjnash

@vtjnash
Copy link
Contributor

vtjnash commented Jul 6, 2023

\uD83D\uDD79\uFE0F is not a legal string, since it incorrectly encodes for a surrogate, not a unicode character. The Joystick emoji encoding is:

U+1F579 ️ U+FE0F

@vtjnash
Copy link
Contributor

vtjnash commented Jul 6, 2023

We could potentially disable the UTF-8 transcoder error checking again though, and permit this to access the file anyways. The name would not round-trip correctly through the file-system, but that is not too significantly different from how case-sensitivity can be lost during the conversion to accessing the file system.

@ZEDCWT
Copy link
Author

ZEDCWT commented Jul 7, 2023

\uD83D\uDD79\uFE0F is not a legal string, since it incorrectly encodes for a surrogate, not a unicode character. The Joystick emoji encoding is:

U+1F579 ️ U+FE0F

For what reason you think it is not valid?

'\u{1F579}'==='\uD83D\uDD79' // true

This is JavaScript, they are identical.

@vtjnash
Copy link
Contributor

vtjnash commented Jul 7, 2023

Ah, interesting. That means node is explicitly converting the valid string in ecmascript's UCS-2 into an invalid UTF-8 string, possibly intentionally since ecmascript does not actually support this emoji and instead defines that it is composed of 2 unknown characters from the surrogate block. We could certainly change libuv to be a non-validating UTF-8 decoder so that it can still accept invalid strings as input.

https://mathiasbynens.be/notes/javascript-encoding

vtjnash added a commit to vtjnash/libuv that referenced this issue Jul 12, 2023
We forgot to mask off the high bits from the first byte, so we ended up
always failing the subsequent range check.

Refs: libuv#297
Fixes: nodejs/node#48673
@rvagg
Copy link
Member

rvagg commented Aug 7, 2023

This maybe shouldn't be closed until we get a libuv release then a Node release that includes it? We're going to collect a lot of dupes for this and they should be closed and point here until the release is done. #48813 #49042.

(Speaking from experience, I'm dealing with some failing windows CI tests since 20.4.0 https://github.com/ipld/codec-fixtures/actions/runs/5756528910/job/15606085523 for files that have ... complicated ... names).

@rvagg rvagg reopened this Aug 7, 2023
@rvagg rvagg changed the title FS can not handle certain characters in file name (Windows) FS can not handle certain characters in file name Aug 7, 2023
@user7230724
Copy link

LTS is also affected. Works in v18.0.0, but not in v18.18.0.

@bnoordhuis
Copy link
Member

It's fixed upstream in libuv/libuv@d09441c but not released yet.

arcanis added a commit to yarnpkg/berry that referenced this issue Sep 28, 2023
**What's the problem this PR addresses?**

Node.js 20.4 broke using emojis in file names:
nodejs/node#48673

**How did you fix it?**

Disabling the code that relies on it (it's just a sanity check anyway).

**Checklist**
<!--- Don't worry if you miss something, chores are automatically
tested. -->
<!--- This checklist exists to help you remember doing the chores when
you submit a PR. -->
<!--- Put an `x` in all the boxes that apply. -->
- [x] I have read the [Contributing
Guide](https://yarnpkg.com/advanced/contributing).

<!-- See
https://yarnpkg.com/advanced/contributing#preparing-your-pr-to-be-released
for more details. -->
<!-- Check with `yarn version check` and fix with `yarn version check
-i` -->
- [x] I have set the packages that need to be released for my changes to
be effective.

<!-- The "Testing chores" workflow validates that your PR follows our
guidelines. -->
<!-- If it doesn't pass, click on it to see details as to what your PR
might be missing. -->
- [x] I will check that all automated PR checks pass before the PR gets
reviewed.
@richardlau richardlau added libuv Issues and PRs related to the libuv dependency or the uv binding. windows Issues and PRs related to the Windows platform. labels Sep 29, 2023
@qianz2
Copy link

qianz2 commented Jan 10, 2024

We're using 20.3.1 version, could we had the fix backported into this version as well?

@AlttiRi
Copy link

AlttiRi commented Feb 17, 2024

It was finally fixed in LTS(5405aa5).


WTF?

import fs from "node:fs/promises";
await fs.writeFile("🚀🔥🛸.txt", "");
node:internal/fs/promises:609
    await binding.openFileHandle(pathModule.toNamespacedPath(path),
                  ^

Error: ENOENT: no such file or directory, open
    at open (node:internal/fs/promises:609:19)
    at Object.writeFile (node:internal/fs/promises:1055:20)
    at file:///C:/Dev/test/x.mjs:2:10
    at ModuleJob.run (node:internal/modules/esm/module_job:218:25)
    at async ModuleLoader.import (node:internal/modules/esm/loader:329:24)
    at async loadESM (node:internal/process/esm_loader:28:7)
    at async handleMainPromise (node:internal/modules/run_main:113:12) {
  errno: -4058,
  code: 'ENOENT',
  syscall: 'open'
}

Node.js v20.11.1

Also, I've seen these errors:

node:internal/fs/utils:557
    stats[0 + offset], stats[1 + offset], stats[2 + offset],
         ^

TypeError: Cannot read properties of undefined (reading '0')
    at getStatsFromBinding (node:internal/fs/utils:557:10)
    at Object.lstat (node:internal/fs/promises:917:10)
    at async file:///C:/Dev/test/x.mjs:3:1
node:internal/fs/promises:915
  const result = await binding.lstat(pathModule.toNamespacedPath(path),
                               ^

Error: ENOENT: no such file or directory, lstat
    at Object.lstat (node:internal/fs/promises:915:32)
    at file:///C:/Downloads/1/x.mjs:3:16
    at ModuleJob.run (node:internal/modules/esm/module_job:218:25)
    at async ModuleLoader.import (node:internal/modules/esm/loader:329:24)
    at async loadESM (node:internal/process/esm_loader:28:7)
    at async handleMainPromise (node:internal/modules/run_main:113:12) {
  errno: -4058,
  code: 'ENOENT',
  syscall: 'lstat'
}

Node.js v20.11.1

It's still not fixed. Even with libuv 1.48.

Even Bun https://github.com/oven-sh/bun/ works fine with it.


UPD.

However, it's fixed in Current (v21.6.2).

But it's not fixed in LTS (v20.11.1) that was released 3 day ago.


> LTS (Long-Term Support). Only verified changes.
> "Recommended For Most Users"
> broken

Are you kidding?


Node.js even have not tests for reading/writing files with surrogate pairs in a filename?

richardlau added a commit to richardlau/node-1 that referenced this issue Mar 1, 2024
Original commit message:

    fs: fix WTF-8 decoding issue (nodejs#4092)

    We forgot to mask off the high bits from the first byte, so we ended up
    always failing the subsequent range check.

    Refs: nodejs#2970
    Fixes: nodejs#48673
richardlau added a commit to richardlau/node-1 that referenced this issue Mar 4, 2024
Original commit message:

    fs: fix WTF-8 decoding issue (nodejs#4092)

    We forgot to mask off the high bits from the first byte, so we ended up
    always failing the subsequent range check.

    Refs: libuv/libuv#2970
    Fixes: nodejs#48673
richardlau added a commit to richardlau/node-1 that referenced this issue Mar 5, 2024
Original commit message:

    fs: fix WTF-8 decoding issue (nodejs#4092)

    We forgot to mask off the high bits from the first byte, so we ended up
    always failing the subsequent range check.

    Refs: libuv/libuv#2970
    Fixes: nodejs#48673
@Rinse12
Copy link

Rinse12 commented Mar 14, 2024

Do we have a release date for this bug fix?

@AlttiRi
Copy link

AlttiRi commented Mar 14, 2024

I assume, in the next LTS release after merging that PR:


It takes more than half a year to add 3 lines of code.

richardlau added a commit that referenced this issue Mar 15, 2024
Original commit message:

    fs: fix WTF-8 decoding issue (#4092)

    We forgot to mask off the high bits from the first byte, so we ended up
    always failing the subsequent range check.

    Refs: libuv/libuv#2970
    Fixes: #48673

PR-URL: #51976
Refs: #48673
Reviewed-By: Rafael Gonzaga <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Marco Ippolito <[email protected]>
Reviewed-By: Santiago Gimeno <[email protected]>
Reviewed-By: Ulises Gascón <[email protected]>
richardlau pushed a commit that referenced this issue Mar 15, 2024
PR-URL: #51800
Backport-PR-URL: #51976
Fixes: #51789
Refs: #48673
Reviewed-By: Luigi Pinca <[email protected]>
Reviewed-By: Antoine du Hamel <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Marco Ippolito <[email protected]>
@AlttiRi
Copy link

AlttiRi commented Mar 30, 2024

Finally, the fix was backported to LTS (v20.12.0).

debadree25 pushed a commit to debadree25/node that referenced this issue Apr 15, 2024
Notable changes:

This release addresses some regressions that appeared in Node.js 18.18.0:

- (Windows) FS can not handle certain characters in file name
  nodejs#48673
- 18 and 20 node images give error - Text file busy (after re-build images)
  nodejs/docker-node#1968
- libuv update in 18.18.0 breaks webpack's thread-loader
  nodejs#49911

The libuv 1.45.0 and 1.46.0 updates that were released in Node.js 18.18.0
have been temporarily reverted.

PR-URL: nodejs#50066
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs Issues and PRs related to the fs subsystem / file system. libuv Issues and PRs related to the libuv dependency or the uv binding. windows Issues and PRs related to the Windows platform.
Projects
None yet
Development

Successfully merging a pull request may close this issue.