-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add lossless debug implementation for unix OsStrs #46798
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
@bors r+ |
📌 Commit 4c5e070 has been approved by |
⌛ Testing commit 4c5e070709686d388281454e1a1384748f5afb01 with merge 286b7b6457b0b923cbc60d0c37b8c814554b6060... |
The new file is going to need a license header.
@bors r- |
💔 Test failed - status-appveyor |
Oops! |
// option. This file may not be copied, modified, or distributed | ||
// except according to those terms. | ||
|
||
#![allow(dead_code)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, on windows builds the function is never called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it work if we do #[cfg(any(unix, target_os = "redox", target_arch = "wasm32"))] pub mod bytestring
instead? I already see some of that for pub mod gnu
and pub mod net
. That way we continue to catch accidental dead code in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so - see src\tools\tidy\src\pal.rs
It looks like the goal is to avoid having platform specific code in sys_common
and net
is a temporary exception. The closest existing file to this is wtf8
, which also uses #[allow(dead_code)]
.
@bors r+ |
📌 Commit 8fac7d9 has been approved by |
Add lossless debug implementation for unix OsStrs Fixes #22766 Invalid utf8 byte sequences are replaced with `\xFF` style escape codes, while valid utf8 goes through the normal `Debug` implementation. This is necessarily different from the windows Debug implementation, which uses `\u{xxxx}` style escape sequences for unpaired surrogates, but both implementations are consistent in that they are both lossless, and display invalid sequences in the way most similar to existing language syntax. r? @dtolnay
☀️ Test successful - status-appveyor, status-travis |
Merge `sys_common::bytestring` back into `os_str_bytes` `bytestring` contains code for correctly debug formatting a byte slice (`[u8]`). This functionality is and has historically only been used to provide the debug formatting of byte-based os-strings (on unix etc.). Having this functionality in the separate `bytestring` module was useful in the past to reduce duplication, as [when it was added](rust-lang#46798) `os_str_bytes` was still split into `sys::{unix, redox, wasi, etc.}::os_str`. However, now that is no longer the case, there is not much reason for the `bytestring` functionality to be separate from `os_str_bytes`; I don't think it is very likely that another part of std will need to handle formatting byte strings that are not os-strings in the future (everything should be `utf8`). This is why this PR merges the functionality of `bytestring` directly into the debug implementation in `os_str_bytes`.
Fixes #22766
Invalid utf8 byte sequences are replaced with
\xFF
style escape codes, while valid utf8 goes through the normalDebug
implementation.This is necessarily different from the windows Debug implementation, which uses
\u{xxxx}
style escape sequences for unpaired surrogates, but both implementations are consistent in that they are both lossless, and display invalid sequences in the way most similar to existing language syntax.r? @dtolnay