[GR-60338] [Native Image] sun.jnu.encoding defaults to legacy code page on Windows #10237

Open
1 task done
fmeum opened this issue Dec 5, 2024 · 1 comment

@fmeum
fmeum commented Dec 5, 2024

Describe the Issue

The Charset used for filesystem operations (e.g. to encode and decode paths) is determined by the sun.jnu.encoding system property, which is inherited from the build-time environment. On Windows, this property defaults to the current code page (as returned by a call to GetACP()), which is usually a legacy encoding such as Cp1252 that cannot represent most Unicode characters. The only workaround is to force the system code page to UTF-8, which is still considered a beta feature and can break other applications.
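
To illustrate why a Cp1252 default is limiting, here is a minimal sketch that checks whether a file name can be encoded in a given Charset. The class name, the helper canRepresent, and the sample file names are hypothetical and only serve the illustration. It runs on a regular JVM and does not reproduce the Native Image behavior itself.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class LegacyCodePageDemo {
    // Hypothetical helper: true if every character of the name is encodable in the charset.
    static boolean canRepresent(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // "Cp1252" is the historical Java name for windows-1252.
        Charset cp1252 = Charset.forName("windows-1252");

        // Unicode escapes keep the source file ASCII-only.
        String latin = "caf\u00e9.txt";               // Latin-1 letter, present in Cp1252
        String japanese = "\u65e5\u672c\u8a9e.txt";   // CJK characters, absent from Cp1252

        System.out.println("Cp1252, Latin name:    " + canRepresent(latin, cp1252));
        System.out.println("Cp1252, Japanese name: " + canRepresent(japanese, cp1252));
        System.out.println("UTF-8,  Japanese name: " + canRepresent(japanese, StandardCharsets.UTF_8));
    }
}
```

Any path containing characters outside the legacy code page falls into the second case, which is presumably where the InvalidPathExceptions described below come from.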

GraalVM Version

Oracle GraalVM 23.0.1+11.1

Operating System and Version

Windows 10 x86

Build Command

Any image build

Expected Behavior

System.getProperty("sun.jnu.encoding") returns UTF-8 by default or there is a build-time option to choose this encoding.

Actual Behavior

System.getProperty("sun.jnu.encoding") returns Cp1252 and operating on filesystem paths with non-ASCII Unicode characters results in InvalidPathExceptions.

Steps to Reproduce

  1. Print System.getProperty("sun.jnu.encoding") from any image build on Windows with default settings.
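
A minimal reproducer sketch for the step above, assuming a hypothetical class name JnuEncodingCheck and an arbitrary non-ASCII file name. It prints the property and then tries to construct a path. Whether the second call fails depends on the platform and on how the binary was built, so no particular output is claimed here.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

public class JnuEncodingCheck {
    // Reads the property this issue is about.
    static String jnuEncoding() {
        return System.getProperty("sun.jnu.encoding");
    }

    // Tries to build a Path and reports the outcome instead of throwing.
    static String tryPath(String name) {
        try {
            return "ok: " + Paths.get(name);
        } catch (InvalidPathException e) {
            return "InvalidPathException: " + e.getReason();
        }
    }

    public static void main(String[] args) {
        System.out.println("sun.jnu.encoding = " + jnuEncoding());
        // Hypothetical sample name with characters outside Cp1252 (ASCII-only source via escapes).
        System.out.println(tryPath("\u65e5\u672c\u8a9e.txt"));
    }
}
```

Building this class with native-image on Windows with default settings and running the resulting executable should show the value reported under Actual Behavior.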

Additional Context

Individual applications can opt into a UTF-8 code page: https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page#set-a-process-code-page-to-utf-8

It may be sufficient to add such a manifest to the native-image.exe binary.

Build Log Output and Error Messages

No response

@selhagani selhagani self-assigned this Dec 5, 2024
@selhagani selhagani changed the title [Native Image] sun.jnu.encoding defaults to legacy code page on Windows [GR-60338] [Native Image] sun.jnu.encoding defaults to legacy code page on Windows Dec 5, 2024
@selhagani
Member

Hi @fmeum,
Thank you for reaching out to us!
We'll take a look into this and I'll make sure to keep you updated.

@selhagani selhagani assigned pejovica and unassigned selhagani Dec 11, 2024