Opening filenames with Unicode on Windows not possible #3408
Comments
We should probably replace whole |
@yuslepukhin thoughts? |
@maysamyabandeh @yuslepukhin would you accept a patch migrating the Windows API calls to Unicode types? |
@domenkozar While getting rid of intermediates like _rmdir is a good idea, you will then have to do the conversion to UTF-16 yourself, whereas the A entry points do it for you now. The rest of the product, before you hit the port layer, uses a single-byte encoding, so you should be able to pass UTF-8 through more or less problem-free and convert to UTF-16 at the port layer, unless someone somewhere tries to iterate a std::string char by char. For all other encodings you would need to somehow pass or autodetect the encoding and then convert to UTF-16, which is not an attractive proposition. |
Changing all the upper types to wchar_t on Windows will probably be too much of an upheaval for the codebase. I'm fine with just encoding UTF-8 to UTF-16 before hitting the native Windows API calls. |
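The "encode UTF-8 to UTF-16 before the native call" idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: Utf8ToUtf16 and MakeDirUtf8 are hypothetical helper names, not RocksDB functions, and the hand-written decoder is there so the sketch also compiles off Windows. On Windows itself, MultiByteToWideChar with CP_UTF8 would be the more usual way to do the conversion.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

#ifdef _WIN32
#include <direct.h>  // _wmkdir
#endif

// Hypothetical helper (not part of RocksDB): decode UTF-8 and re-encode as UTF-16.
// Throws std::invalid_argument on malformed input. For brevity it does not
// reject overlong encodings.
std::u16string Utf8ToUtf16(const std::string& in) {
  std::u16string out;
  std::size_t i = 0;
  while (i < in.size()) {
    const unsigned char lead = static_cast<unsigned char>(in[i]);
    std::uint32_t cp = 0;
    std::size_t extra = 0;  // number of continuation bytes expected
    if (lead < 0x80)               { cp = lead;        extra = 0; }
    else if ((lead >> 5) == 0x06)  { cp = lead & 0x1F; extra = 1; }
    else if ((lead >> 4) == 0x0E)  { cp = lead & 0x0F; extra = 2; }
    else if ((lead >> 3) == 0x1E)  { cp = lead & 0x07; extra = 3; }
    else throw std::invalid_argument("invalid UTF-8 lead byte");

    if (in.size() - i < extra + 1) {
      throw std::invalid_argument("truncated UTF-8 sequence");
    }
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char cont = static_cast<unsigned char>(in[i + k]);
      if ((cont & 0xC0) != 0x80) {
        throw std::invalid_argument("invalid UTF-8 continuation byte");
      }
      cp = (cp << 6) | (cont & 0x3F);
    }
    i += extra + 1;

    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
      throw std::invalid_argument("invalid Unicode code point");
    }
    if (cp < 0x10000) {
      out.push_back(static_cast<char16_t>(cp));
    } else {
      // Code points above the BMP become a surrogate pair.
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
  }
  return out;
}

#ifdef _WIN32
// Hypothetical wrapper: create a directory from a UTF-8 path by converting
// to UTF-16 and calling the wide-character CRT entry point.
int MakeDirUtf8(const std::string& utf8_path) {
  const std::u16string u16 = Utf8ToUtf16(utf8_path);
  const std::wstring wide(u16.begin(), u16.end());  // wchar_t is 16-bit on Windows
  return _wmkdir(wide.c_str());
}
#endif
```

The conversion happens only at the boundary, so everything above the port layer keeps passing std::string around unchanged, which matches the concern about not changing the upper types to wchar_t.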
@chrisdone Here is what I would do if I were in your shoes for the path of least resistance. A couple of possible problems that may be non-issues but are worth mentioning include
|
@yuslepukhin So would you accept a PR that simply encodes UTF-8 to UTF-16 before hitting the Windows API? Or are you suggesting we maintain a fork of rocksdb with such a change? |
@chrisdone I am suggesting something in the middle. Forks are not desirable in general. RocksDB allows you to create a custom environment. You can see plenty of examples, such as mock_env, chroot env, and hdfs.
|
As suggested by @yuslepukhin creating a custom environment is probably a better option here. Please refer to https://github.com/facebook/rocksdb/tree/master/env and https://github.com/facebook/rocksdb/tree/master/hdfs for examples of creating and using custom envs. |
Note for anyone looking this up in the future: This is now fixed in #4469 without having to use a custom env by using the |
@udoprog thanks for the heads-up! |
DB::Open actually calls createDirIfMissing, which calls CreateDir from port/win/env_win.cc, which in turn calls _mkdir, which is from Direct.h and has this type: int _mkdir(const char* pathname). That signature takes a narrow char string rather than wchar_t (or equivalent Win32 API name). The bytes passed into _mkdir are interpreted as the current codepage (CP 437, in my case) as multi-byte characters, as opposed to UTF-16 encoded byte pairs.

So I think it would be easier if we could directly expose a function that uses _wmkdir, which accepts wide-character input. This way you have access to the range that fits in two bytes, rather than the more limited range of CP 437, for example.

It's not normal to create a database with a non-ASCII path, but in our use-case we use absolute paths which may include usernames of people, so that's where the range of Unicode support comes in handy.
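To see what the narrow _mkdir entry point actually receives for a UTF-8 encoded path, it can help to dump the raw bytes. The following is a minimal sketch; HexBytes is a hypothetical debugging helper, not something from RocksDB or the CRT.

```cpp
#include <cstdio>
#include <string>

// Hypothetical debugging helper: render each byte of a string as two hex
// digits, separated by spaces.
std::string HexBytes(const std::string& s) {
  std::string out;
  char buf[4];
  for (unsigned char b : s) {
    std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(b));
    if (!out.empty()) out += ' ';
    out += buf;
  }
  return out;
}
```

For example, "ü" (U+00FC) is the two bytes C3 BC in UTF-8. A narrow API reading them under CP 437 treats them as two separate box-drawing characters, whereas a wide-character API would receive the single UTF-16 code unit 0x00FC.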
If I open a PR, would it be considered for inclusion?