Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Append to string buffer when calling read_line with string buffer #60

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

oysols
Copy link

@oysols oysols commented Oct 1, 2022

This matches implementation of futures-rs and behavior of BufRead::read_line.
This should also fix issue of panic and "drops 6 characters" mentioned in #47.

@fogti
Copy link
Member

fogti commented Nov 6, 2022

wouldn't it be more appropriate to do this in

futures-lite/src/io.rs

Lines 1697 to 1710 in a28ac5b

match String::from_utf8(mem::take(bytes)) {
Ok(s) => {
debug_assert!(buf.is_empty());
debug_assert_eq!(*read, 0);
*buf = s;
Poll::Ready(ret)
}
Err(_) => Poll::Ready(ret.and_then(|_| {
Err(Error::new(
ErrorKind::InvalidData,
"stream did not contain valid UTF-8",
))
})),
}

and replace that segment by

    match core::str::from_utf8(&bytes[..]) {
        Ok(s) => {
            // idk what the following line wants to accomplish exactly...
            debug_assert_eq!(*read, 0);
            *buf += s;
            bytes.clear();
            Poll::Ready(ret)
        }
        Err(_) => Poll::Ready(ret.and_then(|_| {
            Err(Error::new(
                ErrorKind::InvalidData,
                "stream did not contain valid UTF-8",
            ))
        })),
    }

this gets rid of the requirement that the buffer needs to be empty initially, and avoids an unnecessary string allocation in case the buffer already has content. on the other hand, it has the disadvantage that in the arguably common case that the buffer is empty, it does an unnecessary allocation (a few times for bytes and once (this is added) for buf);
the approach of this PR does have that disadvantage in the "alternative case" that the buffer is already filled (it discards the originally allocated buffer, and duplicates it into bytes).
Optimizing for the "alternative case" might be a good idea if someone wants to append to a buffer which is already filled with a few MiB of string data, because we avoid re-doing the UTF-8 validation of the original buffer contents in that case. Deciding between these approaches probably requires benchmarking both.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants