feat: use TextEncoder and TextDecoder for utf8 strings #4513
base: master
master, af1a9c9 | seia-soto:textencoder, 1de005e
- ~65535 ASCII only characters
> 147.50420889870574 / 156.1296767089117 // benchEngineDeserialization
0.944754463135876
> 147.50420889870574 / 148.7394726802865 // benchEngineSerialization
0.9916951179177841
seia-soto:textencoder, 65764e9 | master, de7bfb5
packages/adblocker/src/data-view.ts
const { written } = TEXT_ENCODER.encodeInto(raw, this.buffer.subarray(this.pos + 4));
this.pushLength(written);
Since you do +4 here (32 bits), there is no point in using pushLength: you might as well always push the length as a uint32. The only reason to use pushLength is to spend fewer bits encoding the length of small strings.
Yup, I'm working on a fix that supports dynamic positioning so that pushLength is actually put to use.
@remusao Please review 0339bdc (#4513)
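The review point in this thread can be illustrated with a minimal TypeScript sketch. It assumes a hypothetical variable-length prefix (1 byte for lengths below 128, 4 bytes otherwise); the helper names prefixSize, writeLength, readLength, pushString and popString, as well as the prefix layout, are illustrative and not taken from the library. Instead of always reserving 4 bytes before calling encodeInto, the sketch guesses the prefix size from the string length and moves the encoded bytes only when the guess turns out too small, which is one possible reading of the "dynamic positioning" mentioned above.

```typescript
// Illustrative sketch only: helper names and the prefix layout are assumptions,
// not the adblocker library's actual serialization format.
const encoder = new TextEncoder();
// ignoreBOM: true keeps a leading U+FEFF in the decoded string instead of dropping it.
const decoder = new TextDecoder('utf-8', { ignoreBOM: true });

// Hypothetical scheme: lengths below 128 use 1 byte, larger ones use 4 bytes.
function prefixSize(length: number): number {
  return length < 0x80 ? 1 : 4;
}

function viewOf(buffer: Uint8Array): DataView {
  return new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
}

// Writes the length prefix at `pos` and returns the position right after it.
function writeLength(buffer: Uint8Array, pos: number, length: number): number {
  const view = viewOf(buffer);
  if (length < 0x80) {
    view.setUint8(pos, length);
    return pos + 1;
  }
  // The high bit marks the 4-byte form; the remaining 31 bits hold the length.
  view.setUint32(pos, (0x80000000 | length) >>> 0);
  return pos + 4;
}

function readLength(buffer: Uint8Array, pos: number): { length: number; next: number } {
  const view = viewOf(buffer);
  const first = view.getUint8(pos);
  if (first < 0x80) {
    return { length: first, next: pos + 1 };
  }
  return { length: view.getUint32(pos) & 0x7fffffff, next: pos + 4 };
}

// Encodes `raw` at `pos` with a length prefix and returns the new position.
function pushString(buffer: Uint8Array, pos: number, raw: string): number {
  // UTF-8 output is never shorter than the UTF-16 length, so this guess can
  // only be too small, never too large.
  const guess = prefixSize(raw.length);
  const { read = 0, written = 0 } = encoder.encodeInto(raw, buffer.subarray(pos + guess));
  if (read !== raw.length) {
    throw new RangeError('buffer too small for string');
  }
  const actual = prefixSize(written);
  if (actual !== guess) {
    if (pos + actual + written > buffer.length) {
      throw new RangeError('buffer too small for string');
    }
    // Shift the payload right to make room for the larger prefix.
    buffer.copyWithin(pos + actual, pos + guess, pos + guess + written);
  }
  writeLength(buffer, pos, written);
  return pos + actual + written;
}

function popString(buffer: Uint8Array, pos: number): { value: string; next: number } {
  const { length, next } = readLength(buffer, pos);
  // subarray creates a view, not a copy; decoding stops at the end of the view.
  const value = decoder.decode(buffer.subarray(next, next + length));
  return { value, next: next + length };
}
```

With this layout, pushString(buffer, 0, 'abc') occupies 4 bytes in total (1 prefix byte plus 3 payload bytes), where a fixed uint32 prefix would take 7. The cost is an occasional copyWithin when a short string expands past 127 bytes in UTF-8.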
fixes #4424
This PR replaces the punycode encoder and decoder with TextEncoder and TextDecoder for utf8 strings.
- \ufeff should be skipped when decoding to preserve the original form
- Uint8Array.subarray doesn't copy the array; it provides a direct view onto the same underlying memory
- TextEncoder.encodeInto doesn't write a terminating NULL character (U+0000), but we don't care because TextDecoder stops cleanly at the end of the provided buffer
To be safe: