Skip to content

Commit

Permalink
Allow seeking within a prefetch stream (#556)
Browse files Browse the repository at this point in the history
* Allow seeking forwards within the prefetch stream

Right now we reset the prefetcher any time it seeks forwards, even if
the distance it's seeking could be handled by inflight requests (in the
worst case, the bytes are already in our buffers, and we just throw them
away). That's expensive and slow!

This change allows us to seek forwards a limited distance into the
prefetch stream. When we see a seek of an acceptable distance, we
fast-forward through the stream to the desired target offset, dropping
the skipped bytes on the floor. We enforce a maximum seek distance,
which is a trade-off between streaming a lot of unnecessary bytes versus
an extra request's latency. I haven't put any careful thought into the
number.

This commit also sets us up to support backwards seeking, which will
come in the future.

Signed-off-by: James Bornholt <[email protected]>

* Allow seeking backwards within a prefetch stream

Linux asynchronous readahead confuses our prefetcher by sometimes making
the stream appear to go backwards, even though the customer is actually
just reading sequentially (#488). The problem is that with parallel FUSE
threads, the two asynchronous read operations can arrive to the
prefetcher out of order.

This change allows us to tolerate a little bit of backwards seeking in a
prefetch stream. We keep around a little bit of previously read data and
can reload it in the event that a seek goes backwards. We do this by
creating a fake new request containing the rewound bytes, so that the
existing read logic will pick them up. I chose an arbitrary max for the
backwards seek buffer, big enough to handle Linux readahead.

This should fix the readahead issue: in my testing, I no longer saw slow
sequential reads, and the logs confirmed this seeking logic was being
triggered in both directions (forwards and backwards), consistent with
the readahead requests sometimes arriving out of order.

Signed-off-by: James Bornholt <[email protected]>

* Fix Shuttle tests with new request size logic

The old test was hiding a bug because it used a hard coded part size of
8MB regardless of what the client used. #552 changed that and now this
test runs out of memory a lot because it degrades to doing 1 byte
requests. I don't think it's worth playing with the logic because it
requires a weird config to get there, so just fix the test.

Signed-off-by: James Bornholt <[email protected]>

---------

Signed-off-by: James Bornholt <[email protected]>
  • Loading branch information
jamesbornholt authored Oct 18, 2023
1 parent a50f1ca commit f58dbc5
Show file tree
Hide file tree
Showing 5 changed files with 421 additions and 28 deletions.
6 changes: 6 additions & 0 deletions mountpoint-s3/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,11 @@
## Unreleased

### Breaking changes
* ...

### Other changes
* Fixed a bug that cause poor performance for sequential reads in some cases ([#488](https://github.com/awslabs/mountpoint-s3/pull/488)). A workaround we have previously shared for this issue (setting the `--max-threads` argument to `1`) is no longer necessary with this fix. ([#556](https://github.com/awslabs/mountpoint-s3/pull/556))

## v1.0.2 (September 22, 2023)

### Breaking changes
Expand Down
Loading

0 comments on commit f58dbc5

Please sign in to comment.