Proposal: Allow custom skip function in `Reader` #22183

LucasSantos91 · 2024-12-07T23:41:53Z

This is current implementation of AnyReader.skipBytes.

pub fn skipBytes(self: Self, num_bytes: u64, comptime options: SkipBytesOptions) anyerror!void {
    var buf: [options.buf_size]u8 = undefined;
    var remaining = num_bytes;

    while (remaining > 0) {
        const amt = @min(remaining, options.buf_size);
        try self.readNoEof(buf[0..amt]);
        remaining -= amt;
    }
}

It simply reads into a buffer, and discards it. GenericReader.skipBytes just calls this function.
Many readers can implement skipping efficiently. For file readers, skipping can be done by simply incrementing the file pointer, which may be done without actually calling into the operating system. For the BufferedReader, skipping can by done by simply incrementing the index into the buffer. The current implementation of skipBytes, therefore, leads to unnecessary memory copies and system calls.
I propose we add a user-provided skip function to the Reader interface. We can also provide a default implementation, similar to the current one:

pub fn GenericReader(
    comptime Context: type,

    comptime ReadError: type,
    /// Returns the number of bytes read. It may be less than buffer.len.
    /// If the number of bytes read is 0, it means end of stream.
    /// End of stream is not an error condition.
    comptime readFn: fn (context: Context, buffer: []u8) ReadError!usize,

    comptime SkipError: type,
    /// Returns the number of bytes skipped. It may be less than amount.
    /// If the number of bytes skipped is 0, it means end of stream.
    /// End of stream is not an error condition.
    comptime skipFn: fn(context: Context, amount: usize) SkipError!usize,
) type {
    return struct{
        pub fn skip(self: @This(), amount: usize) SkipError!usize{
            return skipFn(self.ctx, amount);
        }
        // Add the similar conveniences to read: skipAll, skipNoEof.
    };
}

pub fn GenericReaderDefaultSkip(
    comptime Context: type,

    comptime ReadError: type,
    /// Returns the number of bytes read. It may be less than buffer.len.
    /// If the number of bytes read is 0, it means end of stream.
    /// End of stream is not an error condition.
    comptime readFn: fn (context: Context, buffer: []u8) ReadError!usize,
) type {
    const Skip = struct{
        pub fn skipBytes(context: Context, num_bytes: u64) ReadError!usize{
            var buf: [512]u8 = undefined;
            const amt = @min(num_bytes, buf.len);
            return readFn(context, buf[0..amt]);
        }
    };
    return GenericReader(Context, ReadError, readFn, ReadError, Skip.skipBytes);
}

AnyReader then would get an extra field:

context: *const anyopaque,
readFn: *const fn (context: *const anyopaque, buffer: []u8) anyerror!usize,
skipFn: *const fn (context: *const anyopaque, amount: usize) anyerror!usize,

The text was updated successfully, but these errors were encountered:

kj4tmp · 2024-12-09T03:12:01Z

This functionality is typically added via composition with std.io.SeekableStream. For an example of the implementation for std.fs.File. See

zig/lib/std/fs/File.zig

Line 1605 in a221b2f

pub fn seekableStream(file: File) SeekableStream {

LucasSantos91 · 2024-12-09T17:58:33Z

The problem is that many readers are not fully seekable, but can still skip efficiently. If we were to require a SeekableStream in order to get an efficient skip, we would end up with more complexity, because we would have to pass around one more object, and we would prevent the use of readers that don't have a matching SeekableStream.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: Allow custom skip function in `Reader` #22183

Proposal: Allow custom skip function in `Reader` #22183

LucasSantos91 commented Dec 7, 2024

kj4tmp commented Dec 9, 2024

LucasSantos91 commented Dec 9, 2024

Proposal: Allow custom skip function in Reader #22183

Proposal: Allow custom skip function in Reader #22183

Comments

LucasSantos91 commented Dec 7, 2024

kj4tmp commented Dec 9, 2024

LucasSantos91 commented Dec 9, 2024

Proposal: Allow custom skip function in `Reader` #22183

Proposal: Allow custom skip function in `Reader` #22183