Append to string buffer when calling read_line with string buffer #60

oysols · 2022-10-01T15:12:03Z

This matches the implementation in futures-rs and the behavior of BufRead::read_line.
It should also fix the panic and the "drops 6 characters" issue mentioned in #47.
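
For reference, the append behavior of the standard library's BufRead::read_line can be seen in a small synchronous sketch. It uses std::io::Cursor as a stand-in reader rather than the futures-lite async API, and the helper name `read_line_into_prefilled` is made up for illustration:

```rust
use std::io::{BufRead, Cursor};

/// Reads one line from `input` into a buffer that already holds `prefix`.
/// Returns the number of bytes read and the resulting buffer contents.
fn read_line_into_prefilled(prefix: &str, input: &str) -> (usize, String) {
    let mut reader = Cursor::new(input.as_bytes().to_vec());
    let mut buf = String::from(prefix);
    // `BufRead::read_line` appends to `buf`; it does not clear it first.
    let n = reader
        .read_line(&mut buf)
        .expect("reading from an in-memory cursor should not fail");
    (n, buf)
}
```

Calling `read_line_into_prefilled("existing ", "line one\nline two\n")` leaves the prefix in place and appends the first line, including its trailing newline, after it.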

read_line discards existing string buffer

fogti · 2022-11-06T18:48:49Z

wouldn't it be more appropriate to do this in

futures-lite/src/io.rs

Lines 1697 to 1710 in a28ac5b

    
    match String::from_utf8(mem::take(bytes)) {
        Ok(s) => {
            debug_assert!(buf.is_empty());
            debug_assert_eq!(*read, 0);
            *buf = s;
            Poll::Ready(ret)
        }
        Err(_) => Poll::Ready(ret.and_then(|_| {
            Err(Error::new(
                ErrorKind::InvalidData,
                "stream did not contain valid UTF-8",
            ))
        })),
    }

and replace that segment by

    match core::str::from_utf8(&bytes[..]) {
        Ok(s) => {
            // idk what the following line wants to accomplish exactly...
            debug_assert_eq!(*read, 0);
            *buf += s;
            bytes.clear();
            Poll::Ready(ret)
        }
        Err(_) => Poll::Ready(ret.and_then(|_| {
            Err(Error::new(
                ErrorKind::InvalidData,
                "stream did not contain valid UTF-8",
            ))
        })),
    }

This gets rid of the requirement that the buffer be empty initially, and avoids an unnecessary string allocation when the buffer already has content. On the other hand, in the arguably common case that the buffer is empty, it does an unnecessary allocation (a few times for bytes, plus one that this change adds for buf).
The approach of this PR has that disadvantage in the "alternative case" where the buffer is already filled: it discards the originally allocated buffer and duplicates its contents into bytes.
Optimizing for the "alternative case" might be a good idea if someone wants to append to a buffer that already holds a few MiB of string data, because it avoids re-doing the UTF-8 validation of the original buffer contents. Deciding between these approaches probably requires benchmarking both.
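
To make the trade-off concrete, here is a minimal standalone sketch of the append variant. The helper `append_line_bytes` and its signature are hypothetical and not part of futures-lite. It only illustrates that `str::from_utf8` validates the newly read bytes alone, while the existing contents of the string buffer are neither copied out nor re-validated:

```rust
use std::io::{Error, ErrorKind, Result};

/// Hypothetical helper: appends the freshly read `bytes` to `buf`.
/// Only `bytes` is UTF-8 validated; the existing contents of `buf` are left as they are.
fn append_line_bytes(buf: &mut String, bytes: &mut Vec<u8>) -> Result<usize> {
    let n = bytes.len();
    match std::str::from_utf8(&bytes[..]) {
        Ok(s) => {
            // Copies the new data into `buf`, growing its allocation if needed.
            buf.push_str(s);
            // Empty the scratch buffer but keep its allocation for the next read.
            bytes.clear();
            Ok(n)
        }
        Err(_) => Err(Error::new(
            ErrorKind::InvalidData,
            "stream did not contain valid UTF-8",
        )),
    }
}
```

When `buf` starts out empty, this copies the line once more than moving the byte vector into the string would, which is the extra cost described above.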

oysols added 2 commits October 1, 2022 17:03

Add failing test for read_line

12169e5

read_line discards existing string buffer

Fill buffer with content of string buffer

bc1fc99
