Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Allow custom skip function in Reader #22183

Open
LucasSantos91 opened this issue Dec 7, 2024 · 2 comments
Open

Proposal: Allow custom skip function in Reader #22183

LucasSantos91 opened this issue Dec 7, 2024 · 2 comments

Comments

@LucasSantos91
Copy link
Contributor

This is current implementation of AnyReader.skipBytes.

pub fn skipBytes(self: Self, num_bytes: u64, comptime options: SkipBytesOptions) anyerror!void {
    var buf: [options.buf_size]u8 = undefined;
    var remaining = num_bytes;

    while (remaining > 0) {
        const amt = @min(remaining, options.buf_size);
        try self.readNoEof(buf[0..amt]);
        remaining -= amt;
    }
}

It simply reads into a buffer, and discards it. GenericReader.skipBytes just calls this function.
Many readers can implement skipping efficiently. For file readers, skipping can be done by simply incrementing the file pointer, which may be done without actually calling into the operating system. For the BufferedReader, skipping can by done by simply incrementing the index into the buffer. The current implementation of skipBytes, therefore, leads to unnecessary memory copies and system calls.
I propose we add a user-provided skip function to the Reader interface. We can also provide a default implementation, similar to the current one:

pub fn GenericReader(
    comptime Context: type,

    comptime ReadError: type,
    /// Returns the number of bytes read. It may be less than buffer.len.
    /// If the number of bytes read is 0, it means end of stream.
    /// End of stream is not an error condition.
    comptime readFn: fn (context: Context, buffer: []u8) ReadError!usize,

    comptime SkipError: type,
    /// Returns the number of bytes skipped. It may be less than amount.
    /// If the number of bytes skipped is 0, it means end of stream.
    /// End of stream is not an error condition.
    comptime skipFn: fn(context: Context, amount: usize) SkipError!usize,
) type {
    return struct{
        pub fn skip(self: @This(), amount: usize) SkipError!usize{
            return skipFn(self.ctx, amount);
        }
        // Add the similar conveniences to read: skipAll, skipNoEof.
    };
}

pub fn GenericReaderDefaultSkip(
    comptime Context: type,

    comptime ReadError: type,
    /// Returns the number of bytes read. It may be less than buffer.len.
    /// If the number of bytes read is 0, it means end of stream.
    /// End of stream is not an error condition.
    comptime readFn: fn (context: Context, buffer: []u8) ReadError!usize,
) type {
    const Skip = struct{
        pub fn skipBytes(context: Context, num_bytes: u64) ReadError!usize{
            var buf: [512]u8 = undefined;
            const amt = @min(num_bytes, buf.len);
            return readFn(context, buf[0..amt]);
        }
    };
    return GenericReader(Context, ReadError, readFn, ReadError, Skip.skipBytes);
}

AnyReader then would get an extra field:

context: *const anyopaque,
readFn: *const fn (context: *const anyopaque, buffer: []u8) anyerror!usize,
skipFn: *const fn (context: *const anyopaque, amount: usize) anyerror!usize,
@kj4tmp
Copy link
Contributor

kj4tmp commented Dec 9, 2024

This functionality is typically added via composition with std.io.SeekableStream. For an example of the implementation for std.fs.File. See

pub fn seekableStream(file: File) SeekableStream {

@LucasSantos91
Copy link
Contributor Author

The problem is that many readers are not fully seekable, but can still skip efficiently. If we were to require a SeekableStream in order to get an efficient skip, we would end up with more complexity, because we would have to pass around one more object, and we would prevent the use of readers that don't have a matching SeekableStream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants