Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for reading symlinks longer than PATH_MAX to readlink and readlinkat #1231

Merged
merged 1 commit into from
May 12, 2020

Conversation

SolraBizna
Copy link
Contributor

@SolraBizna SolraBizna commented May 3, 2020

This is in response to issue #1178.

The new logic uses the following approach.

  • At any time, if readlink returns an error, or a value ≥ 0 and < (not ≤!) the buffer size, we're done.
  • Attempt to readlink into a PATH_MAX sized buffer. (This will almost always succeed, and saves a system call over calling lstat first.)
  • Try to lstat the link. If it succeeds and returns a sane value, allocate the buffer to be that large plus one byte. Otherwise, allocate the buffer to be PATH_MAX.max(128) << 1 bytes.
  • Repeatedly attempt to readlink. Any time its result is ≥ (not >!) the buffer size, double the buffer size and try again.

While testing this, I discovered that ext4 doesn't allow creation of a symlink > 4095 (Linux's PATH_MAX minus one) bytes long. This is in spite of Linux happily allowing paths in other contexts to be longer than this—including on ext4! This was probably instated to avoid breaking programs that assume PATH_MAX will always be enough, but ironically hindered my attempt to test support for not assuming. I tested the code using an artificially small PATH_MAX and (separately) a wired-to-fail lstat. strace showed the code behaving precisely as expected. Unfortunately, I can't add an automatic test for this.

Other changes made by this PR:

  • wrap_readlink_result now calls shrink_to_fit on the buffer before returning, potentially reclaiming kilobytes of memory per call. This could be very important if the returned buffer is long-lived.
  • readlink and readlink_at now both call an inner_readlink function that contains the bulk of the logic, avoiding copy-pasting of code. (This is much more important now that the logic is more than a few lines long.)

Notably, this PR does not add support for systems that don't define PATH_MAX at all. As far as I know, I don't have access to any POSIX-ish OS that doesn't have PATH_MAX, and I suspect it would have other compatibility issues with nix anyway.

Copy link
Member

@asomers asomers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just checked, and I can't create a too-long symlink on FreeBSD, NetBSD, OpenBSD, or Solaris. Is there any operating system that does allow it?

src/fcntl.rs Outdated
@@ -185,28 +185,70 @@ fn wrap_readlink_result(v: &mut Vec<u8>, res: ssize_t) -> Result<OsString> {
Err(err) => Err(err),
Ok(len) => {
unsafe { v.set_len(len as usize) }
v.shrink_to_fit();
Ok(OsString::from_vec(v.to_vec()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not your fault, but I notice that this line is doing a data copy. Could you eliminate the data copy by taking the v argument by value instead of by reference?

src/fcntl.rs Outdated
unsafe { libc::readlink(cstr.as_ptr(), v.as_mut_ptr() as *mut c_char, v.capacity() as size_t) }
unsafe {
match dirfd {
Some(dirfd) => libc::readlinkat(dirfd, cstr.as_ptr(), v.as_mut_ptr() as *mut c_char, v.capacity() as size_t),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrap the long lines, please.

Copy link
Contributor Author

@SolraBizna SolraBizna May 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this project's wrap width? (I normally wrap to 80 columns, but this project clearly has a wider convention.)

Edit: Apparently the rest of the file is almost all wrapped to 80 columns, and only the code I was editing wasn't. Wrapped all the code I touched to 80 columns.

src/fcntl.rs Outdated
// Uh oh, the result is too long...
// Let's try to ask lstat how many bytes to allocate.
let reported_size = super::sys::stat::lstat(path).and_then(|x| Ok(x.st_size)).unwrap_or(0);
let mut try_size = if reported_size > 0 { reported_size as usize + 1 }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a matter of style, I prefer

if condition {
    something
} else {
   something_else
}

rather than cramming everything onto two lines

// If lstat doesn't cooperate, or reports an error, be a little less precise.
else { (libc::PATH_MAX as usize).max(128) << 1 };
loop {
v.reserve_exact(try_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I think this may be the first good use of reserve_exact I've seen.

src/fcntl.rs Outdated
}
else {
// Ugh! Still not big enough!
let next_size = try_size << 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to gracefully handle symlinks with paths greater than 32 bits. A panic would suffice. You can do

try_size = try_size.checked_mul(2).unwrap();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to return an error in this case (as below), but if you ask me to I'll replace that logic with a checked multiply.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I think this code will still panic if the size overflows 32 bits. I think << , like most operators, will panic on overflow in debug builds (but wrap on release builds). If you really want to handle the overflow without panicing, you need to use checked_shl.

src/fcntl.rs Outdated
else { (libc::PATH_MAX as usize).max(128) << 1 };
loop {
v.reserve_exact(try_size);
let res = path.with_nix_path(|cstr| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 13 lines are duped. Do you think you could deduplicate them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the wrong lines got selected. Do you mean L197-L204 and L218-L225?

src/fcntl.rs Outdated
// Uh oh, the result is too long...
// Let's try to ask lstat how many bytes to allocate.
let reported_size = super::sys::stat::lstat(path).and_then(|x| Ok(x.st_size)).unwrap_or(0);
let mut try_size = if reported_size > 0 { reported_size as usize + 1 }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of comparing reported_size to 0, you may as well compare it to PATH_MAX. After all, lstat reports less than what we already tried, then we shouldn't trust lstat's output.

Copy link
Contributor Author

@SolraBizna SolraBizna May 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trusting lstat after a failed readlink handles a race condition where the link size changes between the first readlink and the first lstat.

@SolraBizna
Copy link
Contributor Author

I just checked, and I can't create a too-long symlink on FreeBSD, NetBSD, OpenBSD, or Solaris. Is there any operating system that does allow it?

On Linux, as far as I can tell, it's up to the particular filesystem. None of [ext*, XFS, btrfs] support very long symlinks, though.

SolraBizna added a commit to SolraBizna/nix that referenced this pull request May 3, 2020
@SolraBizna
Copy link
Contributor Author

Thank you for the thorough review. I have implemented every requested change except the panic on absurdly long symlinks. I'll implement that too while I'm "in the code" if you like.

(I do agree that a >4GiB symlink is absolutely, without a doubt useless, but I also subscribe to the GNU convention of avoiding arbitrary limits where possible, even reasonable ones.)

src/fcntl.rs Outdated
@@ -180,33 +180,88 @@ pub fn renameat<P1: ?Sized + NixPath, P2: ?Sized + NixPath>(old_dirfd: Option<Ra
Errno::result(res).map(drop)
}

fn wrap_readlink_result(v: &mut Vec<u8>, res: ssize_t) -> Result<OsString> {
fn wrap_readlink_result(mut v: Vec<u8>, res: ssize_t) -> Result<OsString> {
match Errno::result(res) {
Err(err) => Err(err),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is dead code. We only ever call wrap_readlink_result when res is >= 0, so you may as well eliminate the Errno::result part. I think that issued predated your changes, but since you're here would you please fix it?

// Uh oh, the result is too long...
// Let's try to ask lstat how many bytes to allocate.
let reported_size = super::sys::stat::lstat(path)
.and_then(|x| Ok(x.st_size)).unwrap_or(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could simply this line into .map_or(0, |x| x.st_size)

src/fcntl.rs Outdated
}
else {
// Ugh! Still not big enough!
let next_size = try_size << 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I think this code will still panic if the size overflows 32 bits. I think << , like most operators, will panic on overflow in debug builds (but wrap on release builds). If you really want to handle the overflow without panicing, you need to use checked_shl.

@SolraBizna
Copy link
Contributor Author

In fact, I think this code will still panic if the size overflows 32 bits. I think << , like most operators, will panic on overflow in debug builds (but wrap on release builds).

It seems that, barring obviously foolish cases caught by the compiler, bit shifting wraps in both debug mode and release mode. Guess Rust's designers figured that overflow in bit shifting is likely to be intentional. Regardless, I replaced that code with checked_shl just because it's clearer and more idiomatic that way.

Thanks again for the feedback. I've pushed the requested changes. If you think of any other tweaks you'd like me to make, let me know.

@SolraBizna
Copy link
Contributor Author

Huh. Using map_or broke the build on FreeBSD. Is it using an old rustc or something? (map_or seems to have been added in 1.41, which was released this last January.)

@asomers
Copy link
Member

asomers commented May 8, 2020

Huh. Using map_or broke the build on FreeBSD. Is it using an old rustc or something? (map_or seems to have been added in 1.41, which was released this last January.)

Yes. Nix's MSRV is currently 1.36.0. We're pretty conservative with MSRV because we're consumed by platforms that are slow to update, like OpenBSD and embedded systems. Could you please remove map_or? Fie on me for suggesting it.

@SolraBizna
Copy link
Contributor Author

Done, and now all the checks pass again.

@asomers
Copy link
Member

asomers commented May 8, 2020

Great! Now all it needs is a squash.

@SolraBizna
Copy link
Contributor Author

Should I be squashing it on my end? I assumed this would be a case for the "Squash and merge" button.

@asomers
Copy link
Member

asomers commented May 11, 2020

Yes you should squash it. Because we use the bors merge bot, we cannot squash on our end.

@SolraBizna SolraBizna force-pushed the improved_readlink branch from 6fdd54d to dfee424 Compare May 11, 2020 04:49
@SolraBizna
Copy link
Contributor Author

There. Hopefully I did that properly.

Copy link
Member

@asomers asomers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

@bors
Copy link
Contributor

bors bot commented May 12, 2020

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants