Zope 5.8.1 breaks using wsgiref
#1171
Comments
Michael Howitz wrote at 2023-10-10 08:28 -0700:
>### What I did:
>I am using Python's stdlib `wsgiref` in selenium tests to start a simple WSGI server.
>### What I expect to happen:
>Using `wsgiref` should be possible.
>### What actually happened:
>`wsgiref` uses a `socket` as `stdin` for the WSGI environment.
Either you are using `wsgiref` wrongly (likely) or `wsgiref` does not follow
the recommendation of
"https://peps.python.org/pep-3333/#input-and-error-streams" note 1.
(less likely):
HTTP specifies 2 methods to indicate the end of a request:
close the communication stream or specify the request body
length via the `Content-Length` header.
If the communication stream is closed, `read` will return (and not block).
If the request body length is specified via the `Content-Length` header,
the note mentioned above stipulates that the "server" should
emulate an end-of-file condition as soon as that number of bytes have
been read. Again `read` will not block (unless `Content-Length` is wrong).
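To make the two cases concrete, here is a small sketch that uses a local `socket.socketpair()` as a stand-in for the client/server connection. The helper name `try_read` and the 0.5 s timeout are illustrative choices, not part of `wsgiref` or `Zope`. It suggests that a bounded `read(n)` returns once `n` bytes have arrived, while an unbounded `read()` only returns after the peer has shut down its writing end.

```python
import socket


def try_read(size, client_shuts_down=False, timeout=0.5):
    """Read from the server end of a socket pair; return None on timeout."""
    client, server = socket.socketpair()
    server.settimeout(timeout)  # turns "blocks forever" into an exception
    with client, server, server.makefile("rb") as rfile:
        client.sendall(b"hello")  # a 5 byte "request body"
        if client_shuts_down:
            client.shutdown(socket.SHUT_WR)  # signals end-of-stream
        try:
            return rfile.read(size)  # size=-1 means "read until EOF"
        except socket.timeout:
            return None  # the read would have blocked
```

Calling `try_read(5)` models a reader that honours a `Content-Length` of 5, `try_read(-1)` models an unbounded `read()` on an open connection, and `try_read(-1, client_shuts_down=True)` models a client that closes its writing end.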
>And Zope 5.8.1+ uses `read()` to get the data from this socket, the execution is blocked and later on runs into a timeout.
>Previous versions (via `cgi` module) called `readline()` on the socket which does not block.
We are now using `multipart` to parse `multipart/form-data` request
bodies. `multipart` uses `read` (taking into account a potential
`Content-Length` header) and not `readline`.
We cannot switch to `readline` without dropping `multipart`.
`ZPublisher.HTTPRequest` follows the `multipart` example and
uses `read` to read from the request input stream.
Unlike `multipart`, it may depend on an end-of-file condition emulation
when `Content-Length` bytes have been read because it does
not look for this header.
>(Sorry I cannot provide a minimal test example as it involves too much custom code. Additionally I did not find an easy way to start a Zope instance using `wsgiref` – the examples for using a different WSGI server in the docs seem to require some preparation.)
@icemac
You must provide additional information:
Are you sure that any provided `Content-Length` header value is correct?
Are you sure that the request communication channel is closed
when there is no `Content-Length` header?
For what `Content-Type` do you observe the blocking `read`?
How does your application access the request body in this case?
This information will allow us to determine whether the problem is
in `multipart` or `ZPublisher.HTTPRequest`.
Are your requests created in a special way?
I am thinking of chunked transfer encoding.
On the `Zope` side, there is no corresponding support.
Consequently, problems are to be expected should the server
part (`wsgiref` in your case) not transform such requests.
|
@d-maurer Thank you for your analysis, I'll try to gather more information. |
Dieter Maurer wrote at 2023-10-11 07:53 +0200:
...
>`wsgiref` uses a `socket` as `stdin` for the WSGI environment.
>Either you are using `wsgiref` wrongly (likely) or `wsgiref` does not follow
>the recommendation of
>"https://peps.python.org/pep-3333/#input-and-error-streams" note 1.
>(less likely):
`wsgiref.simple_server.WSGIRequestHandler` does not follow the
recommendation.
@icemac
We could work around this in `Zope` (by wrapping the incoming `wsgi.input`
with something which enforces the recommendation)
but in my view, your test setup is a better place:
use your own request handler (derived from `WSGIRequestHandler`)
which does the wrapping.
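A minimal sketch of such a request handler follows. The names `LengthLimitedInput` and `LimitingRequestHandler` are hypothetical. The sketch assumes that `WSGIRequestHandler.handle()` passes `self.rfile` to the WSGI handler only after `parse_request()` has returned, which is an implementation detail of current `wsgiref` versions rather than a documented guarantee. It also assumes that a missing or malformed `Content-Length` means "no request body".

```python
from wsgiref.simple_server import WSGIRequestHandler


class LengthLimitedInput:
    """File-like wrapper that reports EOF after `length` bytes."""

    def __init__(self, stream, length):
        self._stream = stream
        self._remaining = max(length, 0)

    def _clamp(self, size):
        # Never ask the underlying stream for more than is left.
        if size is None or size < 0 or size > self._remaining:
            return self._remaining
        return size

    def read(self, size=-1):
        size = self._clamp(size)
        data = self._stream.read(size) if size else b""
        self._remaining -= len(data)
        return data

    def readline(self, size=-1):
        size = self._clamp(size)
        data = self._stream.readline(size) if size else b""
        self._remaining -= len(data)
        return data

    def readlines(self, hint=-1):
        return list(self)  # the hint is ignored in this sketch

    def __iter__(self):
        return iter(self.readline, b"")

    def close(self):
        # The socketserver machinery closes `rfile` when the request ends.
        self._stream.close()


class LimitingRequestHandler(WSGIRequestHandler):
    """Wrap the request stream once the headers are known."""

    def parse_request(self):
        ok = super().parse_request()
        if ok:
            try:
                length = int(self.headers.get("Content-Length") or 0)
            except ValueError:
                length = 0  # assumption: malformed header means no body
            self.rfile = LengthLimitedInput(self.rfile, length)
        return ok
```

In a test setup this would be passed to the server factory, for example `make_server("localhost", 8080, app, handler_class=LimitingRequestHandler)`, where `app` stands for the WSGI application under test.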
|
Dieter Maurer wrote at 2023-10-11 07:53 +0200:
...
>HTTP specifies 2 methods to indicate the end of a request:
>close the communication stream or specify the request body
>length via the `Content-Length` header.
Potentially, the above is not completely right:
if a `multipart/form-data` request is sent without a `Content-Length`
header, then a `readline` based parsing can detect the request body end
(via the closing boundary) while a buffered `read` (as
used by `multipart`) will not detect it reliably and the `read` may block.
The client can shut down its writing end of the communication
channel to indicate that the request has been completely transmitted.
However, apparently, clients typically do not do this
(recently, we discovered that `waitress`, the default server integrated with
`Zope`, does not handle this one-sided shutdown correctly --
a strong indication that it does not happen).
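The `readline` based end detection mentioned above can be sketched as follows. This is an illustration only, not the code of the stdlib `cgi` module; the function name `read_until_closing_boundary` is hypothetical and the sketch ignores details such as optional whitespace after the boundary. It stops at the closing boundary line, so it needs neither a `Content-Length` header nor an end-of-file condition.

```python
import io


def read_until_closing_boundary(stream, boundary):
    """Read a multipart body line by line up to the closing boundary."""
    closing = b"--" + boundary + b"--"  # closing delimiter line
    lines = []
    while True:
        line = stream.readline()
        if not line:
            break  # EOF before the closing boundary was seen
        lines.append(line)
        if line.rstrip(b"\r\n") == closing:
            break  # body is complete, do not read any further
    return b"".join(lines)


body = (
    b"--XyZ\r\n"
    b'Content-Disposition: form-data; name="a"\r\n'
    b"\r\n"
    b"1\r\n"
    b"--XyZ--\r\n"
)
# Bytes after the body (here "NEXT") are left untouched in the stream.
stream = io.BytesIO(body + b"NEXT")
parsed = read_until_closing_boundary(stream, b"XyZ")
```

A buffered `read` has no such stopping point: without a length limit it keeps waiting for more bytes after the closing boundary.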
|
Dieter Maurer wrote at 2023-10-11 16:43 +0200:
Dieter Maurer wrote at 2023-10-11 07:53 +0200:
> ...
>HTTP specifies 2 methods to indicate the end of a request:
>close the communication stream or specify the request body
>length via the `Content-Length` header.
See "https://datatracker.ietf.org/doc/html/rfc7230#section-3":
...
If a message body has been
indicated, then it is read as a stream until an amount of octets
equal to the message body length is read or the connection is closed.
...
This indicates that an HTTP agent is not expected to
"parse" the message body (e.g. by identifying a closing boundary)
to determine its end. The message body length must be derivable from
the "start-line" and the message headers.
In addition, "https://datatracker.ietf.org/doc/html/rfc7230#section-3.3"
is relevant:
...
The presence of a message body in a request is signaled by a
Content-Length or Transfer-Encoding header field.
...
`Zope` does not support `Transfer-Encoding`
(violating "https://datatracker.ietf.org/doc/html/rfc7230#section-3.3.1").
Thus any request with a body processed by `Zope`
must have a `Content-Length` header.
Currently, `Zope` does not check for `Content-Length` to
determine whether a request body is present.
Instead, it assumes its presence for `POST` requests
and delegates proper handling of the request body for all
other cases to the application.
While `multipart` supports the body length specification via
`Content-Length`, `Zope` currently does not pass this information down
to `multipart`, nor does `Zope` currently honour `Content-Length`
at other places in request processing.
Thus, `Zope` currently relies on the WSGI server honouring
the WSGI recommendation to emulate an EOF condition after
`Content-Length` bytes have been read.
I will work on a PR to honour `Content-Length` in case there
are no hints that the WSGI server already has done this.
|
There may still be a problem with WSGI servers passing the socket to the application: |
#1172 has introduced a workaround but I fear it is not the optimal solution: A better solution would be the implementation of the wrapping in a WSGI middleware. Integrators combining |
We decided that |