flush() after writing to gzip_file #2753

Open
wants to merge 2 commits into
base: master
@vin vin commented Nov 15, 2024

Summary

In order to better support streaming responses where the chunks are smaller than the file buffer size, we flush after writing.

Without the explicit flush, the writes are buffered and the subsequent reads see an empty self.gzip_buffer until the file flushes automatically, which happens only when either (1) the 32 KiB write buffer [1] fills or (2) the file is closed because the streaming response is complete.

Without flushing, the GZipMiddleware doesn't work as expected for streaming responses, especially not for Server-Sent Events which are expected to be delivered immediately to clients. The code as written appears to intend to flush immediately rather than buffering, as it does immediately call await self.send(message), but in practice that message is often empty.
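
The buffering described above can be reproduced with the standard library alone. The following is a minimal sketch, not the middleware's actual code: `compress_stream` is a hypothetical helper that writes each chunk to a `gzip.GzipFile` backed by a `BytesIO`, then drains the buffer the way a streaming responder would, optionally calling `flush()` after each write.

```python
import gzip
import io


def compress_stream(chunks, flush):
    """Return the compressed bytes available after each chunk is written.

    Hypothetical helper for illustration; it mirrors the write/read/truncate
    pattern described in this PR, not the exact middleware implementation.
    """
    buffer = io.BytesIO()
    gzip_file = gzip.GzipFile(mode="wb", fileobj=buffer)
    emitted = []
    for index, chunk in enumerate(chunks):
        gzip_file.write(chunk)
        if index == len(chunks) - 1:
            # Last chunk: closing writes the remaining data and the gzip trailer.
            gzip_file.close()
        elif flush:
            # Push everything written so far into the underlying buffer.
            gzip_file.flush()
        # Drain whatever reached the buffer, as a responder would before sending.
        emitted.append(buffer.getvalue())
        buffer.seek(0)
        buffer.truncate()
    return emitted


chunks = [b"data: one\n\n", b"data: two\n\n", b"data: three\n\n"]
buffered = compress_stream(chunks, flush=False)
flushed = compress_stream(chunks, flush=True)

# Sizes of the messages a client would receive in each mode.
print([len(part) for part in buffered])
print([len(part) for part in flushed])
```

In both modes the concatenated output decompresses to the original data; the difference is only in when the bytes become available. Without the flush, the small intermediate chunks produce empty messages and the payload arrives at close.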

Checklist

  • I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.

Footnotes

  1. https://github.com/python/cpython/blob/main/Lib/gzip.py#L26

@Kludex
Member

Kludex commented Nov 15, 2024

Can we add a test to prove your point?

@vin
Author

vin commented Nov 15, 2024

I'll work on that today. I have a small repro case I'll share but I need to make it run as a test

@vin
Author

vin commented Nov 16, 2024

I've added a test, but it's a bit complicated. Without the flush, the entire contents of the response are correct, but to show that they are received iteratively rather than all at once, I use a wrapping middleware to assert that GZipMiddleware isn't sending empty message bodies, which is what it does without the flush.
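
For readers unfamiliar with the approach, a wrapping middleware of this kind might look roughly like the sketch below. This is not the test added in the PR: `RecordBodies`, `streaming_app`, and `run_once` are hypothetical names, and the stand-in app sends uncompressed chunks so the example runs with the standard library only. In a real test the recorder would wrap the GZipMiddleware instead.

```python
import asyncio


class RecordBodies:
    """Hypothetical ASGI middleware that records each response body message."""

    def __init__(self, app):
        self.app = app
        self.recorded = []  # list of (body, more_body) tuples

    async def __call__(self, scope, receive, send):
        async def recording_send(message):
            if message["type"] == "http.response.body":
                self.recorded.append(
                    (message.get("body", b""), message.get("more_body", False))
                )
            await send(message)

        await self.app(scope, receive, recording_send)


async def streaming_app(scope, receive, send):
    """Stand-in for the wrapped app: streams two small chunks, then finishes."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"one", "more_body": True})
    await send({"type": "http.response.body", "body": b"two", "more_body": True})
    await send({"type": "http.response.body", "body": b"", "more_body": False})


async def run_once(app):
    """Drive a single request through the app with no-op receive/send callables."""

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        pass

    await app({"type": "http", "method": "GET", "path": "/"}, receive, send)


recorder = RecordBodies(streaming_app)
asyncio.run(run_once(recorder))

# Every message that promises more data should carry a non-empty body.
assert all(body for body, more_body in recorder.recorded if more_body)
```

The assertion only covers messages sent while the stream is still open, since the final message may legitimately be empty.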

@vin
Author

vin commented Nov 25, 2024

@Kludex could you take another look or recommend a good reviewer? Thanks!

@josecsotomorales

I'm honestly not sure if this is related, but I have a FastAPI project and I'm getting some GZIP exceptions after upgrading to Python 3.13

Exception ignored in: <gzip on 0x7f42bf6a0130>
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/gzip.py", line 359, in close
    fileobj.write(self.compress.flush())
ValueError: I/O operation on closed file.
Exception ignored in: <gzip on 0x7f42bf6a2dd0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/gzip.py", line 359, in close
    fileobj.write(self.compress.flush())
ValueError: I/O operation on closed file.
Exception ignored in: <gzip on 0x7f42b40bb460>
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/gzip.py", line 359, in close
    fileobj.write(self.compress.flush())
ValueError: I/O operation on closed file.
Exception ignored in: <gzip on 0x7f42be191de0>
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/gzip.py", line 359, in close
    fileobj.write(self.compress.flush())
ValueError: I/O operation on closed file.
Exception ignored in: <gzip on 0x7fd2c45fd450>
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/gzip.py", line 359, in close
    fileobj.write(self.compress.flush())
ValueError: I/O operation on closed file.
2024-12-04 00:30:04.105 | INFO     | logging:callHandlers:1736 - 127.0.0.1:59939 - "GET /api/ HTTP/1.1" 200

I have the following GZIP wrapper:

import gzip
from typing import Callable

from fastapi import Request, Response
from fastapi.routing import APIRoute


class GzipRequest(Request):
    async def body(self) -> bytes:
        if not hasattr(self, "_body"):
            body = await super().body()
            if "gzip" in self.headers.getlist("Content-Encoding"):
                body = gzip.decompress(body)
            self._body = body  # noqa
        return self._body


class GzipRoute(APIRoute):
    def get_route_handler(self) -> Callable:
        original_route_handler = super().get_route_handler()

        async def custom_route_handler(request: Request) -> Response:
            request = GzipRequest(request.scope, request.receive)
            return await original_route_handler(request)

        return custom_route_handler

@Kludex
Member

Kludex commented Dec 5, 2024

Not related.

@vin
Author

vin commented Dec 13, 2024

Any concerns here, or how can we best move this forward?

@Kludex
Member

Kludex commented Dec 13, 2024

> Any concerns here, or how can we best move this forward?

The best way to move forward would be to present the problem first, with an MRE, and references to other issues where other people had the same problem.

I think the current behavior is intentional, so I need to get more references around before reviewing this.
