Skip to content

Commit

Permalink
Merge #1231
Browse files Browse the repository at this point in the history
1231: Add support for reading symlinks longer than `PATH_MAX` to `readlink` and `readlinkat` r=asomers a=SolraBizna

This is in response to issue #1178.

The new logic uses the following approach.

- At any time, if `readlink` returns an error, or a value ≥ 0 and < (not ≤!) the buffer size, we're done.
- Attempt to `readlink` into a `PATH_MAX` sized buffer. (This will almost always succeed, and saves a system call over calling `lstat` first.)
- Try to `lstat` the link. If it succeeds and returns a sane value, allocate the buffer to be that large plus one byte. Otherwise, allocate the buffer to be `PATH_MAX.max(128) << 1` bytes.
- Repeatedly attempt to `readlink`. Any time its result is ≥ (not >!) the buffer size, double the buffer size and try again.

While testing this, I discovered that ext4 doesn't allow creation of a symlink > 4095 (Linux's `PATH_MAX` minus one) bytes long. This is in spite of Linux happily allowing paths in other contexts to be longer than this—including on ext4! This was probably instated to avoid breaking programs that assume `PATH_MAX` will always be enough, but ironically hindered my attempt to test support for *not* assuming. I tested the code using an artificially small `PATH_MAX` and (separately) a wired-to-fail `lstat`. `strace` showed the code behaving precisely as expected. Unfortunately, I can't add an automatic test for this.

Other changes made by this PR:
- `wrap_readlink_result` now calls `shrink_to_fit` on the buffer before returning, potentially reclaiming kilobytes of memory per call. This could be very important if the returned buffer is long-lived.
- `readlink` and `readlink_at` now both call an `inner_readlink` function that contains the bulk of the logic, avoiding copy-pasting of code. (This is much more important now that the logic is more than a few lines long.)

Notably, this PR does *not* add support for systems that don't define `PATH_MAX` at all. As far as I know, I don't have access to any POSIX-ish OS that doesn't have `PATH_MAX`, and I suspect it would have other compatibility issues with `nix` anyway.

Co-authored-by: Solra Bizna <[email protected]>
  • Loading branch information
bors[bot] and SolraBizna authored May 12, 2020
2 parents 095b5be + dfee424 commit 33f4efe
Show file tree
Hide file tree
Showing 2 changed files with 67 additions and 19 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@ This project adheres to [Semantic Versioning](http://semver.org/).
when a group or user lookup required a buffer larger than
16KB. (#[1198](https://github.com/nix-rust/nix/pull/1198))
- Fixed unaligned casting of `cmsg_data` to `af_alg_iv` (#[1206](https://github.com/nix-rust/nix/pull/1206))
- Fixed `readlink`/`readlinkat` when reading symlinks longer than `PATH_MAX` (#[1231](https://github.com/nix-rust/nix/pull/1231))

### Removed

Expand Down
85 changes: 66 additions & 19 deletions src/fcntl.rs
Original file line number Diff line number Diff line change
Expand Up @@ -180,33 +180,80 @@ pub fn renameat<P1: ?Sized + NixPath, P2: ?Sized + NixPath>(old_dirfd: Option<Ra
Errno::result(res).map(drop)
}

fn wrap_readlink_result(v: &mut Vec<u8>, res: ssize_t) -> Result<OsString> {
match Errno::result(res) {
Err(err) => Err(err),
Ok(len) => {
unsafe { v.set_len(len as usize) }
Ok(OsString::from_vec(v.to_vec()))
fn wrap_readlink_result(mut v: Vec<u8>, len: ssize_t) -> Result<OsString> {
unsafe { v.set_len(len as usize) }
v.shrink_to_fit();
Ok(OsString::from_vec(v.to_vec()))
}

fn readlink_maybe_at<P: ?Sized + NixPath>(dirfd: Option<RawFd>, path: &P,
v: &mut Vec<u8>)
-> Result<libc::ssize_t> {
path.with_nix_path(|cstr| {
unsafe {
match dirfd {
Some(dirfd) => libc::readlinkat(dirfd, cstr.as_ptr(),
v.as_mut_ptr() as *mut c_char,
v.capacity() as size_t),
None => libc::readlink(cstr.as_ptr(),
v.as_mut_ptr() as *mut c_char,
v.capacity() as size_t),
}
}
}
})
}

pub fn readlink<P: ?Sized + NixPath>(path: &P) -> Result<OsString> {
fn inner_readlink<P: ?Sized + NixPath>(dirfd: Option<RawFd>, path: &P)
-> Result<OsString> {
let mut v = Vec::with_capacity(libc::PATH_MAX as usize);
let res = path.with_nix_path(|cstr| {
unsafe { libc::readlink(cstr.as_ptr(), v.as_mut_ptr() as *mut c_char, v.capacity() as size_t) }
})?;

wrap_readlink_result(&mut v, res)
// simple case: result is strictly less than `PATH_MAX`
let res = readlink_maybe_at(dirfd, path, &mut v)?;
let len = Errno::result(res)?;
debug_assert!(len >= 0);
if (len as usize) < v.capacity() {
return wrap_readlink_result(v, res);
}
// Uh oh, the result is too long...
// Let's try to ask lstat how many bytes to allocate.
let reported_size = super::sys::stat::lstat(path)
.and_then(|x| Ok(x.st_size)).unwrap_or(0);
let mut try_size = if reported_size > 0 {
// Note: even if `lstat`'s apparently valid answer turns out to be
// wrong, we will still read the full symlink no matter what.
reported_size as usize + 1
} else {
// If lstat doesn't cooperate, or reports an error, be a little less
// precise.
(libc::PATH_MAX as usize).max(128) << 1
};
loop {
v.reserve_exact(try_size);
let res = readlink_maybe_at(dirfd, path, &mut v)?;
let len = Errno::result(res)?;
debug_assert!(len >= 0);
if (len as usize) < v.capacity() {
break wrap_readlink_result(v, res);
}
else {
// Ugh! Still not big enough!
match try_size.checked_shl(1) {
Some(next_size) => try_size = next_size,
// It's absurd that this would happen, but handle it sanely
// anyway.
None => break Err(super::Error::Sys(Errno::ENAMETOOLONG))
}
}
}
}

pub fn readlink<P: ?Sized + NixPath>(path: &P) -> Result<OsString> {
inner_readlink(None, path)
}

pub fn readlinkat<P: ?Sized + NixPath>(dirfd: RawFd, path: &P) -> Result<OsString> {
let mut v = Vec::with_capacity(libc::PATH_MAX as usize);
let res = path.with_nix_path(|cstr| {
unsafe { libc::readlinkat(dirfd, cstr.as_ptr(), v.as_mut_ptr() as *mut c_char, v.capacity() as size_t) }
})?;

wrap_readlink_result(&mut v, res)
pub fn readlinkat<P: ?Sized + NixPath>(dirfd: RawFd, path: &P)
-> Result<OsString> {
inner_readlink(Some(dirfd), path)
}

/// Computes the raw fd consumed by a function of the form `*at`.
Expand Down

0 comments on commit 33f4efe

Please sign in to comment.