Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set raw_body to None if no HTTP body was sent #506

Merged
merged 3 commits into from
Sep 1, 2017

Conversation

jamesls
Copy link
Member

@jamesls jamesls commented Aug 30, 2017

This also includes an update the local.py so we get
consistent behavior when no HTTP body is provided.

Fixes #503.

@codecov-io
Copy link

codecov-io commented Aug 31, 2017

Codecov Report

Merging #506 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #506      +/-   ##
==========================================
- Coverage   94.01%   94.01%   -0.01%     
==========================================
  Files          18       18              
  Lines        2941     2939       -2     
  Branches      380      380              
==========================================
- Hits         2765     2763       -2     
  Misses        129      129              
  Partials       47       47
Impacted Files Coverage Δ
chalice/app.py 97.79% <100%> (ø) ⬆️
chalice/local.py 97.6% <100%> (-0.04%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0c2fa06...dfdd367. Read the comment docs.

Copy link
Contributor

@kyleknap kyleknap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks fine. I am curious as to why you chose raw body value as None instead of setting it as an empty string? I read the bug report and it looked like you were contemplating either None or empty string.

@jamesls
Copy link
Member Author

jamesls commented Aug 31, 2017

I don't have a strong preference either way. The two things that swayed me were:

  1. API Gateway set the body key to null if you don't provide a body so it's consistent with the event dict from Lambda.
  2. The default (for the class __init__) has been None so this seemed to be a less risky change. I suppose I could keep the default in the __init__ as None, but then set the default to an empty string if I wanted in the body of the __init__, e.g. if raw_body is None: raw_body = ''

Do you think it should default to an empty string?

@kyleknap
Copy link
Contributor

I was not sure either, which is why I asked. The way I see it is returning a string is nice because you do not have to do any special casing for None in the logic. However, is it possible to send an empty string as the body so that API Gateway set the body as an empty string in the lambda event? Just wondering if it is even possible to directly get an empty string. If so, it may be better to keep it as None. Also do you know other frameworks/libraries default to like flask? That may sway my preference.

@jamesls
Copy link
Member Author

jamesls commented Aug 31, 2017

However, is it possible to send an empty string as the body so that API Gateway set the body as an empty string in the lambda event?

I could not get API Gateway to send the empty string. I think this is the default behavior of the lambda_proxy integration for API gateway

Also do you know other frameworks/libraries default to like flask?

Flask sets it to the empty string (or more specifically an empty bytestring b''). I'm thinking that might sway my preference as well.

@kyleknap
Copy link
Contributor

Given those reasons, I am leaning toward empty string or rather empty bytestring for the raw body. That is also assuming it won't break anyone.

@jamesls
Copy link
Member Author

jamesls commented Aug 31, 2017

Cool, pushed a change to defaults to the empty bytestring.

I don't believe this breaks anyone. The original issue was that sending an empty body and trying to access raw_body resulted in an uncaught exception so there wasn't a default value set when no request body was sent.

@jamesls
Copy link
Member Author

jamesls commented Aug 31, 2017

Interesting...

The failures are due to the fact that to_dict() isn't able to be JSON serialized in python3 because the Request class now has a non json-serializable type (the empty byestring b'').

This is actually an independent problem separate from this change. You can trigger this failure by trying to call to_dict() on the current request if you send any request body and access app.current_request.raw_body before calling app.current_request.to_dict().

It was never my intent to map internal class attributes as part of the app.current_request.to_dict(), and I believe when the method was first written there were no internal attributes. However, the JSON we generate now is:

{
    "_body": null,
    "_is_base64_encoded": false,
    "_json_body": null,
    "_raw_body": null,
    "context": {...},
    "headers": {...},
    "method": "POST",
    "query_params": null,
    "stage_vars": null,
    "uri_params": null
}

Which means my initial thoughts of just remove keys that start with _ could break users. I still think this is the right thing to do because:

  1. The internal attributes will actually vary depending on what you access in the request object, i.e .json_body is computed lazily so whether or not this value is null depends on if you've accessed the property.
  2. You can subtly break your existing app if you're using to_dict() and then add an access to .raw_body in your view function.

What do you all think?

cc @kyleknap @stealthycoin

@kyleknap
Copy link
Contributor

Agreed. I think it is fine to not include the _ keys in the return value of to_dict(). Internal attributes in general should not be included. But I have a couple of follow up questions:

  1. For the second breaking change: You can subtly break your existing app if you're using to_dict() and then add an access to .raw_body in your view function. Can you describe in code how that breaks? It seems strange that public properties would break based on calling a public get action.

  2. Is it a contract that to_dict() will always be JSON serializable? If not, is it just recommended and should we make it part of the contract with this API?

@jamesls
Copy link
Member Author

jamesls commented Aug 31, 2017

  1. Here's an example app:
@app.route('/')
def broken():
    app.current_request.raw_body
    return app.current_request.to_dict()

This app currently crashes. If you omit the line app.current_request.raw_body then the app will not crash. The reason for this is because in the chalice handler code, we construct the dict that lambda needs, so we have to serialize the body to JSON (by default in line 315):

chalice/chalice/app.py

Lines 312 to 323 in f787f3e

def to_dict(self, binary_types=None):
body = self.body
if not isinstance(body, _ANY_STRING):
body = json.dumps(body, default=handle_decimals)
response = {
'headers': self.headers,
'statusCode': self.status_code,
'body': body
}
if binary_types is not None:
self._b64encode_body_if_needed(response, binary_types)
return response

When you access app.current_request.raw_body, this value is also computed lazily in a property, so accessing .raw_body will internally cache this computed value as self._raw_body, and self._raw_body is of type Optional[bytes]:

chalice/chalice/app.py

Lines 278 to 287 in f787f3e

@property
def raw_body(self):
if self._raw_body is None:
if self._is_base64_encoded:
self._raw_body = self._base64decode(self._body)
elif not isinstance(self._body, bytes):
self._raw_body = self._body.encode('utf-8')
else:
self._raw_body = self._body
return self._raw_body

However if you don't access the .raw_body property, then self._raw_body is None, which serializes to null in JSON without issues.

  1. Yes: http://chalice.readthedocs.io/en/latest/api.html#Request.to_dict . We have some limited tests but nothing that checks all the different ways a request object can be constructed. I think I may have found a legitimate use for hypothesis in chalice.

This also includes an update the local.py so we get
consistent behavior when no HTTP body is provided.

Fixes aws#503.
Changed based on the PR feedback.
@kyleknap
Copy link
Contributor

Cool that makes sense to me. I will take a look at the code.

Copy link
Contributor

@kyleknap kyleknap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Looks good to me. 🚢. I think hypothesis testing is not necessarily required for this PR, but we should add it in the future.

@jamesls jamesls merged commit dfdd367 into aws:master Sep 1, 2017
@jamesls jamesls mentioned this pull request Sep 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants