Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a strict version of JSON encoding #61

Merged
merged 5 commits into from
Nov 16, 2020

Conversation

alusco-scratch
Copy link
Contributor

In some cases when debugging it's nice to have twirp reject JSON
requests that have unknown fields. This adds a new JSON_STRICT encoding
that will enable this behavior. It can be opted into by specifying the
strict parameter when specifying the content type. For curl this looks
like:

curl --request POST \
  --url 'http://localhost:8080/twirp/example.hello_world.HelloWorld/Hello' \
  --header 'Content-Type: application/json; strict=true' \
  --data '{"name": "World", "bob": "fail"}'

In some cases when debugging it's nice to have twirp reject JSON
requests that have unknown fields.  This adds a new JSON_STRICT encoding
that will enable this behavior.  It can be opted into by specifying the
strict parameter when specifying the content type.  For curl this looks
like:

```
curl --request POST \
  --url 'http://localhost:8080/twirp/example.hello_world.HelloWorld/Hello' \
  --header 'Content-Type: application/json; strict=true' \
  --data '{"name": "World", "bob": "fail"}'
```
@alusco-scratch
Copy link
Contributor Author

alusco-scratch commented Nov 6, 2020

Based on discussions in #60, it seemed like there was appetite for this and it is kind of neat/convenient. Happy to close this without committing but wanted to send it out in case we want to merge it.

Copy link
Collaborator

@marioizquierdo marioizquierdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea the more I see it.

At first, I worried that this may only be useful if used by default; the idea is to improve debugging, and adding new options could add more noise to the docs. But in my experience, the curl requests used for debugging are usually copy-pasted from the docs and then modified. If we include strict=true in the examples, that would work well as a default option, while still explicit.

Another thing I worried at first, was that the content type application/json; strict=true is not a valid content type by the spec. And it would fail if sending requests with that content type to other implementations. For example, the Go implementation would return an error with "unexpected Content-Type". But then I realized that is actually good. The message includes exactly what is wrong, so it would be easy to modify the request and try again. After all, the strict=true part of the content-type meant for debugging with manual requests. Besides the content type, the Twirp spec doesn't specify details on how to serialize JSON. As long as it can be parsed/unparsed by the protobuf library in other languages it is fine.

This could be proposed on the Go implementation and maybe made an optional part of the spec. Because this is addressing a common issue where JSON encoding is meant for debugging only, but sometimes it is also used in production, and the requirements for those two scenarios are quite different.

--url http://localhost:8080/twirp/example.hello_world.HelloWorld/Hello \
--header 'Content-Type: application/json; strict=true' \
--data '{"name": "World", "unknown": "field will fail"}'
```
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to include this part in the docs. Instead of that, let's add strict=true on all other examples. If all the examples with curl include strict=true, people will use that already by default. If strict mode is not used by default, then it is not that useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just updated the original curl example to include strict=true and removed my note and example

PROTO = "application/protobuf"

class << self

def decode(bytes, msg_class, content_type)
case content_type
when JSON then msg_class.decode_json(bytes, ignore_unknown_fields: true)
when JSON_STRICT then msg_class.decode_json(bytes)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add ignore_unknown_fields: false explicitly, so it is more clear what is the difference with JSON

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, done.

@@ -32,6 +37,7 @@ def decode(bytes, msg_class, content_type)
def encode(msg_obj, msg_class, content_type)
case content_type
when JSON then msg_class.encode_json(msg_obj, emit_defaults: true)
when JSON_STRICT then msg_class.encode_json(msg_obj, emit_defaults: true)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when JSON, JSON_STRICT then ... maybe better in the same statement because it is the same?
Granted, it seems that with this option, it would make more sense to use emit_defaults: true only when JSON_STRICT. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting -- I'm a fan of this though my only concern is to be sure it doesn't make a behavior change. If someone is using JSON mode and we stop emitting defaults then the way ruby parses protos some messages would be nil that were once filled in.

To be fair that's probably not the end of the world but I could see it being somewhat unexpected. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've thought a bit more on this, this is definitely something that can happen but I don't think there is a huge risk in getting rid of emit_defaults in JSON mode. I will note it'll be super annoying though to hit this bug if you don't have context. Going to make the change but happy to revert it if you change your mind.

Copy link
Contributor Author

@alusco-scratch alusco-scratch Nov 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I thought about this even more, feels sketchy and confusing. I'd actually really like to get rid of emit_defaults since it's probably costing bytes for people but now if you test via curl and then call you may get confused. Also if you're passing this JSON payload to a website via some JSON API type thing you may have depended on the behavior.

I'm happy to change it since I think it's for the best to get rid of it but do you think it requires a major version bump?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated but want your thoughts per above @marioizquierdo

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works either way. Twirp actually didn't have emit_defaults until a few months back when v7 was released (neither on Ruby or Go implementations). That was indeed a common point of confusion when debugging, and that's why we changed it to emit_defaults: true. The change is mostly backwards compatible (we didn't see any report of issues because of adding those keys back). It will be fine to change it back to false.

So in one hand it makes sense to keep it as true to be closer to the current Go implementation. But in other hand it is fine to use false by default because that's how it used to behave before, and now we have a clear option with strict: true. So, really up to you, it is fine with me either way.

Copy link
Collaborator

@marioizquierdo marioizquierdo Nov 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arg, ok, I made a last-minute decision and will change emit_defaults back to true in the regular mode. I see your struggle now 😄 This tricky minor thing heh. I will merge this and then change it in master in a post commit before making the release.

The rationale is that we should stay closer to the Go implementation by default. We should propose using strict=true in Go, and then if that is accepted and the Go implementation decides to change emit_defaults back to false then we can do that in here as well.

@alusco-scratch
Copy link
Contributor Author

I generally agree at first I was like meh but now it's really growing on me lol.

Another thing I worried at first, was that the content type application/json; strict=true is not a valid content type by the spec. And it would fail if sending requests with that content type to other implementations. For example, the Go implementation would return an error with "unexpected Content-Type".

You know what's sorta awesome is that the go implementation won't error since it properly parses content type and drops any parameters. You can see that here

i := strings.Index(header, ";")
	if i == -1 {
		i = len(header)
	}
	switch strings.TrimSpace(strings.ToLower(header[:i]))

Removes my added note and example on strict mode and just updates the
curl example to use it.
This changes it so defaults are only emitted when using the JSON strict
mode.

Note: This will change behavior in subtle ways.  Fields that would have
defaults set are not unset which could result in ruby proto fields
changing to nil values or JSON objects being undefined.
@alusco-scratch
Copy link
Contributor Author

Friendly ping :)

Copy link
Collaborator

@marioizquierdo marioizquierdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 Thank you so much for the contribution!

@alusco-scratch
Copy link
Contributor Author

Okay doke, addressed all the feedback. Feel free to merge whenever :)

@marioizquierdo marioizquierdo merged commit c5d5b19 into arthurnn:master Nov 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants