-
Notifications
You must be signed in to change notification settings - Fork 673
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for reading symlinks longer than PATH_MAX
to readlink
and readlinkat
#1231
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just checked, and I can't create a too-long symlink on FreeBSD, NetBSD, OpenBSD, or Solaris. Is there any operating system that does allow it?
src/fcntl.rs
Outdated
@@ -185,28 +185,70 @@ fn wrap_readlink_result(v: &mut Vec<u8>, res: ssize_t) -> Result<OsString> { | |||
Err(err) => Err(err), | |||
Ok(len) => { | |||
unsafe { v.set_len(len as usize) } | |||
v.shrink_to_fit(); | |||
Ok(OsString::from_vec(v.to_vec())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not your fault, but I notice that this line is doing a data copy. Could you eliminate the data copy by taking the v
argument by value instead of by reference?
src/fcntl.rs
Outdated
unsafe { libc::readlink(cstr.as_ptr(), v.as_mut_ptr() as *mut c_char, v.capacity() as size_t) } | ||
unsafe { | ||
match dirfd { | ||
Some(dirfd) => libc::readlinkat(dirfd, cstr.as_ptr(), v.as_mut_ptr() as *mut c_char, v.capacity() as size_t), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wrap the long lines, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this project's wrap width? (I normally wrap to 80 columns, but this project clearly has a wider convention.)
Edit: Apparently the rest of the file is almost all wrapped to 80 columns, and only the code I was editing wasn't. Wrapped all the code I touched to 80 columns.
src/fcntl.rs
Outdated
// Uh oh, the result is too long... | ||
// Let's try to ask lstat how many bytes to allocate. | ||
let reported_size = super::sys::stat::lstat(path).and_then(|x| Ok(x.st_size)).unwrap_or(0); | ||
let mut try_size = if reported_size > 0 { reported_size as usize + 1 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a matter of style, I prefer
if condition {
something
} else {
something_else
}
rather than cramming everything onto two lines
// If lstat doesn't cooperate, or reports an error, be a little less precise. | ||
else { (libc::PATH_MAX as usize).max(128) << 1 }; | ||
loop { | ||
v.reserve_exact(try_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. I think this may be the first good use of reserve_exact
I've seen.
src/fcntl.rs
Outdated
} | ||
else { | ||
// Ugh! Still not big enough! | ||
let next_size = try_size << 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to gracefully handle symlinks with paths greater than 32 bits. A panic would suffice. You can do
try_size = try_size.checked_mul(2).unwrap();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to return an error in this case (as below), but if you ask me to I'll replace that logic with a checked multiply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, I think this code will still panic if the size overflows 32 bits. I think <<
, like most operators, will panic on overflow in debug builds (but wrap on release builds). If you really want to handle the overflow without panicing, you need to use checked_shl
.
src/fcntl.rs
Outdated
else { (libc::PATH_MAX as usize).max(128) << 1 }; | ||
loop { | ||
v.reserve_exact(try_size); | ||
let res = path.with_nix_path(|cstr| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 13 lines are duped. Do you think you could deduplicate them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the wrong lines got selected. Do you mean L197-L204 and L218-L225?
src/fcntl.rs
Outdated
// Uh oh, the result is too long... | ||
// Let's try to ask lstat how many bytes to allocate. | ||
let reported_size = super::sys::stat::lstat(path).and_then(|x| Ok(x.st_size)).unwrap_or(0); | ||
let mut try_size = if reported_size > 0 { reported_size as usize + 1 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of comparing reported_size
to 0, you may as well compare it to PATH_MAX
. After all, lstat
reports less than what we already tried, then we shouldn't trust lstat
's output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trusting lstat
after a failed readlink
handles a race condition where the link size changes between the first readlink
and the first lstat
.
On Linux, as far as I can tell, it's up to the particular filesystem. None of [ext*, XFS, btrfs] support very long symlinks, though. |
Thank you for the thorough review. I have implemented every requested change except the panic on absurdly long symlinks. I'll implement that too while I'm "in the code" if you like. (I do agree that a >4GiB symlink is absolutely, without a doubt useless, but I also subscribe to the GNU convention of avoiding arbitrary limits where possible, even reasonable ones.) |
src/fcntl.rs
Outdated
@@ -180,33 +180,88 @@ pub fn renameat<P1: ?Sized + NixPath, P2: ?Sized + NixPath>(old_dirfd: Option<Ra | |||
Errno::result(res).map(drop) | |||
} | |||
|
|||
fn wrap_readlink_result(v: &mut Vec<u8>, res: ssize_t) -> Result<OsString> { | |||
fn wrap_readlink_result(mut v: Vec<u8>, res: ssize_t) -> Result<OsString> { | |||
match Errno::result(res) { | |||
Err(err) => Err(err), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is dead code. We only ever call wrap_readlink_result
when res
is >= 0, so you may as well eliminate the Errno::result
part. I think that issued predated your changes, but since you're here would you please fix it?
// Uh oh, the result is too long... | ||
// Let's try to ask lstat how many bytes to allocate. | ||
let reported_size = super::sys::stat::lstat(path) | ||
.and_then(|x| Ok(x.st_size)).unwrap_or(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could simply this line into .map_or(0, |x| x.st_size)
src/fcntl.rs
Outdated
} | ||
else { | ||
// Ugh! Still not big enough! | ||
let next_size = try_size << 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, I think this code will still panic if the size overflows 32 bits. I think <<
, like most operators, will panic on overflow in debug builds (but wrap on release builds). If you really want to handle the overflow without panicing, you need to use checked_shl
.
It seems that, barring obviously foolish cases caught by the compiler, bit shifting wraps in both debug mode and release mode. Guess Rust's designers figured that overflow in bit shifting is likely to be intentional. Regardless, I replaced that code with Thanks again for the feedback. I've pushed the requested changes. If you think of any other tweaks you'd like me to make, let me know. |
Huh. Using |
Yes. Nix's MSRV is currently 1.36.0. We're pretty conservative with MSRV because we're consumed by platforms that are slow to update, like OpenBSD and embedded systems. Could you please remove |
Done, and now all the checks pass again. |
Great! Now all it needs is a squash. |
Should I be squashing it on my end? I assumed this would be a case for the "Squash and merge" button. |
Yes you should squash it. Because we use the bors merge bot, we cannot squash on our end. |
… and `readlinkat`
6fdd54d
to
dfee424
Compare
There. Hopefully I did that properly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Build succeeded: |
This is in response to issue #1178.
The new logic uses the following approach.
readlink
returns an error, or a value ≥ 0 and < (not ≤!) the buffer size, we're done.readlink
into aPATH_MAX
sized buffer. (This will almost always succeed, and saves a system call over callinglstat
first.)lstat
the link. If it succeeds and returns a sane value, allocate the buffer to be that large plus one byte. Otherwise, allocate the buffer to bePATH_MAX.max(128) << 1
bytes.readlink
. Any time its result is ≥ (not >!) the buffer size, double the buffer size and try again.While testing this, I discovered that ext4 doesn't allow creation of a symlink > 4095 (Linux's
PATH_MAX
minus one) bytes long. This is in spite of Linux happily allowing paths in other contexts to be longer than this—including on ext4! This was probably instated to avoid breaking programs that assumePATH_MAX
will always be enough, but ironically hindered my attempt to test support for not assuming. I tested the code using an artificially smallPATH_MAX
and (separately) a wired-to-faillstat
.strace
showed the code behaving precisely as expected. Unfortunately, I can't add an automatic test for this.Other changes made by this PR:
wrap_readlink_result
now callsshrink_to_fit
on the buffer before returning, potentially reclaiming kilobytes of memory per call. This could be very important if the returned buffer is long-lived.readlink
andreadlink_at
now both call aninner_readlink
function that contains the bulk of the logic, avoiding copy-pasting of code. (This is much more important now that the logic is more than a few lines long.)Notably, this PR does not add support for systems that don't define
PATH_MAX
at all. As far as I know, I don't have access to any POSIX-ish OS that doesn't havePATH_MAX
, and I suspect it would have other compatibility issues withnix
anyway.