Make `char::DecodeUtf16::size_hint` more precise #93347
New implementation takes into account contents of `self.buf` and rounds lower bound up instead of down.
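The rounding part of this can be illustrated in isolation. The following is a minimal sketch, not the code from this PR: `bounds_without_buffer` and `decoded_len` are hypothetical names, and the sketch assumes nothing is buffered. Each decoded item consumes at most two code units, so `n` remaining units yield at least `ceil(n / 2)` items and at most `n`.

```rust
use std::char::decode_utf16;

/// Hypothetical helper (the name is an assumption, not from the PR):
/// bounds on the number of items produced from `n` remaining code units
/// when nothing is buffered.
fn bounds_without_buffer(n: usize) -> (usize, usize) {
    // ceil(n / 2), written so that it cannot overflow.
    let lower = n / 2 + n % 2;
    // Every unit could be a non-surrogate, giving one item per unit.
    (lower, n)
}

/// Counts the items (chars or errors) decoded from `units`.
fn decoded_len(units: &[u16]) -> usize {
    decode_utf16(units.iter().copied()).count()
}
```

For three units, rounding down would give a lower bound of 1, while rounding up gives 2, which is what a surrogate pair followed by one more unit actually produces.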
r? @dtolnay (rust-highfive has picked a reviewer for you, use r? to override)
library/core/src/char/decode.rs (outdated)
    // char), or entirely non-surrogates (1 element per char)
    (low / 2, high)

    // `self.buf` will never contain the first part of a surrogate,
Why? It doesn't seem to me like that's the case.
For example the following would fail the test below.
check(&[0xD800, 0xD800, 0xDC00]);
thread 'char::test_decode_utf16_size_hint' panicked at 'lower = 2, upper = Some(2)', library/core/tests/char.rs:320:13
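The `check` helper from the PR's test is not shown in this conversation. A minimal sketch of such a check, assuming it compares `size_hint` against the true remaining count after every call to `next` (the name `first_violation` and its return shape are assumptions):

```rust
use std::char::decode_utf16;

/// Hypothetical stand-in for the `check` helper used in the PR's test.
/// Returns the first violation as `(lower, count, upper)`, or `None` if
/// the hint brackets the true remaining count at every step.
fn first_violation(units: &[u16]) -> Option<(usize, usize, Option<usize>)> {
    let mut iter = decode_utf16(units.iter().copied());
    loop {
        let (lower, upper) = iter.size_hint();
        // Cloning lets us count the remaining items without consuming `iter`.
        let count = iter.clone().count();
        let upper_ok = upper.map_or(true, |u| count <= u);
        if lower > count || !upper_ok {
            return Some((lower, count, upper));
        }
        if iter.next().is_none() {
            return None;
        }
    }
}
```

On a toolchain whose standard library includes the final version of this change, `first_violation(&[0xD800, 0xD800, 0xDC00])` is expected to return `None`.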
That was a wrong assumption carried over from the original PR that I hadn't checked 😅
I pushed a fix that checks the contents of the buf.
Force-pushed from a40122c to 2c97d10.
`self.buf` can contain a surrogate, but only a leading one.
`check(&[0xD800, 0xD800, 0x0])` fails your test.
thread 'char::test_decode_utf16_size_hint' panicked at 'lower = 1, count = 2, upper = Some(1)', library/core/tests/char.rs:320:13
There are cases where the data in `buf` might or might not turn out to be an error.
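One way to reason about those cases is sketched below. This is an illustration under stated assumptions, not the code merged in this PR: the function names are hypothetical, the buffered unit is assumed to be either a non-surrogate or a leading surrogate, and `remaining` stands for an exact count of units left in the underlying iterator.

```rust
/// Hypothetical helper: true for any UTF-16 surrogate code unit.
fn is_surrogate(u: u16) -> bool {
    (0xD800..=0xDFFF).contains(&u)
}

/// Hypothetical helper: (min, max) number of extra items the buffered
/// unit adds on top of the items counted for the remaining input.
fn buffered_contribution(buf: Option<u16>, input_is_empty: bool) -> (usize, usize) {
    match buf {
        // Nothing buffered, nothing extra.
        None => (0, 0),
        // A non-surrogate always decodes to exactly one char.
        Some(u) if !is_surrogate(u) => (1, 1),
        // A surrogate with no input left cannot pair: exactly one error.
        Some(_) if input_is_empty => (1, 1),
        // A surrogate that may pair with the next unit (no extra item)
        // or may stand alone as an error (one extra item).
        Some(_) => (0, 1),
    }
}

/// Hypothetical sketch of the combined hint. Overflow of `remaining + 1`
/// is ignored here for brevity.
fn sketch_size_hint(buf: Option<u16>, remaining: usize) -> (usize, usize) {
    let (buf_low, buf_high) = buffered_contribution(buf, remaining == 0);
    (remaining / 2 + remaining % 2 + buf_low, remaining + buf_high)
}
```

With `0xD800` buffered and one unit remaining, this sketch gives `(1, 2)`, which covers both inputs raised in this review: the remaining count is 1 when the next unit is `0xDC00` and 2 when it is `0x0`.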
@dtolnay I fixed this edge case too. I wonder if I still missed something 😄
Thanks, looks good.
@bors r+
📌 Commit 17cd2cd has been approved by
…16_size_hint, r=dtolnay

Make `char::DecodeUtf16::size_hint` more precise

New implementation takes into account contents of `self.buf` and rounds lower bound up instead of down.

Fixes rust-lang#88762
Revival of rust-lang#88763
…askrgr

Rollup of 8 pull requests

Successful merges:
- rust-lang#90277 (Improve terminology around "after typeck")
- rust-lang#92918 (Allow eliding GATs in expression position)
- rust-lang#93039 (Don't suggest inaccessible fields)
- rust-lang#93155 (Switch pretty printer to block-based indentation)
- rust-lang#93214 (Respect doc(hidden) when suggesting available fields)
- rust-lang#93347 (Make `char::DecodeUtf16::size_hint` more precise)
- rust-lang#93392 (Clarify documentation on char::MAX)
- rust-lang#93444 (Fix some CSS warnings and errors from VS Code)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
New implementation takes into account contents of `self.buf` and rounds lower bound up instead of down.

Fixes #88762
Revival of #88763