src: add node:encoding module #45823
I'm the author of node-iconv and in my experience users care only about two things:
A handful of people also care about:
|
Nice to meet you @bnoordhuis, and thanks for your work on
Would it make sense to have a |
@anonrig that already exists. See buffer.transcode. I don't think we need a new top level module. If these are added, they should be added to the buffer module. I think of these, counting utf8 bytes is likely the most useful. |
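For readers unfamiliar with the existing API mentioned above, here is a minimal sketch of buffer.transcode. It assumes a Node.js build with ICU support (the default for official releases); the sample string is only an illustration.

```javascript
const { Buffer, transcode } = require('node:buffer');

// Re-encode a UTF-8 buffer as UTF-16LE ("ucs2" in Node's encoding names).
const utf8 = Buffer.from('héllo', 'utf8');
const utf16 = transcode(utf8, 'utf8', 'ucs2');

// "é" takes two bytes in UTF-8; every character here takes two bytes in UTF-16LE.
console.log(utf8.length, utf16.length);

// Decoding the transcoded bytes gives back the original text.
console.log(utf16.toString('ucs2'));
```

Note that transcode works on Buffer or Uint8Array input only, which is part of what the discussion about non-buffer inputs is about.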
Could this also have a faster equivalent of Buffer.byteLength(string)? Or could that be used to speed that up? |
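As a reminder of what Buffer.byteLength(string) already reports, the short sketch below compares the UTF-16 length of a few sample strings with their UTF-8 byte length. The sample strings are illustrative, not taken from the pull request.

```javascript
const { Buffer } = require('node:buffer');

// String.prototype.length counts UTF-16 code units, not encoded bytes.
const samples = ['hello', 'héllo', '€', '😀'];

const report = samples.map((text) => ({
  text,
  codeUnits: text.length,
  utf8Bytes: Buffer.byteLength(text, 'utf8'),
}));

console.table(report);
```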
@anonrig fwiw Ben has been a project member for 10+ years with over 2000 commits to core. Or roughly 15 times more commits than you and I combined ^^ As for the change, what's the motivation? Performance? |
@mcollina Once we're aligned with the path of this pull request, I'm going to add fast-api calls which will also speed up
@benjamingr In my previous pull request (which is waiting to be reviewed at the moment), I added simdutf, which provides a performance boost for certain UTF operations. With this pull request, I wanted to enable more operations for all types of input, not specifically for buffer, and expose them in a
@jasnell IMHO, I don't believe that |
87e3335 to 8df4bbf
I also have concerns about adding a new top level module here. As James mentioned, it's debatable if this functionality warrants a top level module. The very minimal initial API is also a bit of a concern. Furthermore, even with the |
I've made the decision to go with adding these methods to |
The high-speed native-build |
I opened a new pull request for adding buffer.isAscii, therefore I'm closing this pull request. Thank you for all your reviews & time. |
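A rough sketch of how buffer.isAscii might be used, assuming it is exported from node:buffer and accepts a Buffer or TypedArray. The isAsciiFallback helper is hypothetical and only there so the example also runs on Node.js versions that do not ship the function; it is not the native implementation.

```javascript
const buffer = require('node:buffer');

// Hypothetical pure-JS fallback: every byte must be below 0x80.
function isAsciiFallback(bytes) {
  for (const byte of bytes) {
    if (byte > 0x7f) return false;
  }
  return true;
}

// Prefer the built-in when the running Node.js version provides it.
const isAscii =
  typeof buffer.isAscii === 'function' ? buffer.isAscii : isAsciiFallback;

console.log(isAscii(buffer.Buffer.from('hello')));  // ASCII-only input
console.log(isAscii(buffer.Buffer.from('héllo')));  // contains a non-ASCII byte sequence
```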
Hmm, not a particular fan of adding even more new features onto |
This is a proposal, and depends on my previous pull request
In summary, node:encoding enables Unicode validation and transcoding (if needed). We can also go with a node:unicode name if users are confused by the naming.
At the moment, I only added the following methods:
TODO:
- countUtf8ByteLength() and replace internal usage of node_buffer.cc byteLengthUtf8
- normalize() for normalizing encodings
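To make the first TODO item concrete, here is a pure-JavaScript sketch of what a countUtf8ByteLength() function could compute. The name comes from the TODO above, but the signature and implementation are assumptions for illustration; the proposal targets a native implementation, which this does not attempt to reproduce.

```javascript
const { Buffer } = require('node:buffer');

// Hypothetical sketch: count the bytes a string would occupy once encoded as UTF-8.
function countUtf8ByteLength(str) {
  let bytes = 0;
  // Iterating a string yields whole code points (surrogate pairs are joined).
  for (const ch of str) {
    const codePoint = ch.codePointAt(0);
    if (codePoint < 0x80) bytes += 1;          // ASCII
    else if (codePoint < 0x800) bytes += 2;    // two-byte sequences
    else if (codePoint < 0x10000) bytes += 3;  // rest of the BMP (a lone surrogate also counts as 3)
    else bytes += 4;                           // supplementary planes
  }
  return bytes;
}

// Cross-check against the existing Buffer.byteLength on a few sample strings.
for (const sample of ['hello', 'héllo', '€', '😀']) {
  console.log(sample, countUtf8ByteLength(sample), Buffer.byteLength(sample, 'utf8'));
}
```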