-
Notifications
You must be signed in to change notification settings - Fork 107
fix(httpjson): handle message derived query params #1784
Conversation
Fixes #1783 Some message derived types such as Duration, FieldMask or Int32Value would not be correctly handled by String.valueOf(). Instead, the toJson() method is used to make it compliant with the protobuf languague guide
public String toQueryParamValue(Object fieldValue) { | ||
if (fieldValue instanceof GeneratedMessageV3) { | ||
return toJson(((GeneratedMessageV3) fieldValue).toBuilder(), false) | ||
.replaceAll("^\"", "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks suspicious to me, any documentation mentioned that we need to do this manually? What would it looks like if we don't replace them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did this to get the actual contents of a string value. For example, a FieldMask would serialize into "a.b.c,a.b"
, but this replace() will get a.b.c,a.b
without quotes. I will see if there is a better way to ensure sanitization.
* @param fieldValue a field value to serialize | ||
*/ | ||
public String toQueryParamValue(Object fieldValue) { | ||
if (fieldValue instanceof GeneratedMessageV3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we want to handle this for all GeneratedMessageV3, this basically means we want to handle any complex types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vam-google I saw that Noah came up with a list of wellknownTypes, do you think this is a full list for supported types in query params?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the well known types to the logic. The check for GeneratedMessageV3 in toQueryParamValue
will be intersected with serializable message types (e.g. FieldMask) with an extra condition in this if statement - also added a comment line for clarification)
FYI, a similar fix is implemented in go. |
Some message type objects will be passed as query params. These may have nested properties that will now be generated as ?&foo.bar=1&foo.baz=2
6e2dd55
to
0ab76a4
Compare
0ab76a4
to
63e0118
Compare
@@ -51,6 +58,29 @@ | |||
public class ProtoRestSerializer<RequestT extends Message> { | |||
private final TypeRegistry registry; | |||
|
|||
// well-known types obtained from | |||
// https://github.com/googleapis/gapic-showcase/blob/fe414784c18878d704b884348d84c68fd6b87466/util/genrest/resttools/populatefield.go#L27 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I proposed the list from gapic-showcase but I don't think we should just use it, showcase is only used for testing so the list may get out of date. We should probably go through the protobuf doc and add any type that have special handling to the list. For now, since Go already implemented it, we can probably follow what they did here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure, I can update the list of well-known types following the docs list, but I'm curious if there is a better way to programatically obtain a list of well-known types so we can update it as a dependency or so.
We have this documentation entry as a source of well-known types but we may have to manually check from time to time to keep our hardcoded list up to date.
However, I noted that the official definitions such as duration.proto would specify a special dotnet package Google.Protobuf.WellKnownTypes
, whereas in java it falls into the common com.google.protobuf
package, leaving us with no reliable programatic way to obtain a list of well-known types that would get updated as a dependency. As a last (and bad) source, there is a unit test in the same folder of the duration definition that contains a list of well-known types, but it is a unit test, so I don't think we have too many options
for (Map.Entry<FieldDescriptor, Object> fieldEntry : | ||
((GeneratedMessageV3) fieldValue).getAllFields().entrySet()) { | ||
Object value = fieldEntry.getValue(); | ||
putQueryParam( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This recursive logic seems complicated and could run into StackOverFlow issue if not careful, also doing all this during runtime could affect the performance for client libraries as well. So now I really think that we should do this traversing all leaf level fields
part in the generator instead of gax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
.replaceAll("^\"", "") | ||
.replaceAll("\"$", ""); | ||
} | ||
if (fieldValue instanceof ByteString) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need to handle ByteString separately here?
*/ | ||
public String toQueryParamValue(Object fieldValue) { | ||
// This will match with message types that are serializable (e.g. FieldMask) | ||
if (fieldValue instanceof GeneratedMessageV3 && !isNonSerializableMessageValue(fieldValue)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just use Message
instead of GeneratedMessagev3
in all places since it's more generic. I also found that we could pass an Enum to query params, where Enum is not of Message
type, so we may need to handle enum specifically as well.
// https://github.com/googleapis/gapic-showcase/blob/fe414784c18878d704b884348d84c68fd6b87466/util/genrest/resttools/populatefield.go#L27 | ||
private static final Set<Class<GeneratedMessageV3>> jsonSerializableMessages = | ||
new HashSet( | ||
Arrays.asList( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Effective Java 26: Don't use raw types. To do that and make this immutable at the same time, consider:
private static final Set<Class<? extends GeneratedMessageV3>> JSON_SERIALIZABLE_MESSAGES =
ImmutableSet.of(
BoolValue.class,
Note that the collection could be considered a constant since it is now immutable, thus now worthy of upper snake case.
@@ -127,16 +169,34 @@ public void putPathParam(Map<String, String> fields, String fieldName, Object fi | |||
* @param fieldValue a field value | |||
*/ | |||
public void putQueryParam(Map<String, List<String>> fields, String fieldName, Object fieldValue) { | |||
ImmutableList.Builder<String> paramValueList = ImmutableList.builder(); | |||
ArrayList<String> paramValueList = new ArrayList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ArrayList
is a raw-type (so, please at minimum change to use the diamond operator new ArrayList<>()
) and is changing the contract of this public function to modify the fields
output collection to contain a mutable data structure rather than an immutable one. Is this intended, or do we want to keep this immutable?
if (isNonSerializableMessageValue(fieldValue)) { | ||
putDecomposedMessageQueryParam(fields, fieldName, fieldValue); | ||
return; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When working with the pattern:
if (x is Collection)
foreach i in x
// do A(i)
else
// do A(x)
We can often refactor to:
Iterable<?> fieldValues =
fieldValue instanceof Iterable
? (Iterable<?>) fieldValue
: Collections.singleton(fieldValue);
foreach i in x
// do A(i)
This eliminates the need to study both implementations of A to determine if they're the same or not, with the added benefit of it being easier to test the single code path rather than both. As the complexity of A increases, this simplification becomes more and more beneficial.
private void putDecomposedMessageQueryParam( | ||
Map<String, List<String>> fields, String fieldName, Object fieldValue) { | ||
for (Map.Entry<FieldDescriptor, Object> fieldEntry : | ||
((GeneratedMessageV3) fieldValue).getAllFields().entrySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This cast is only safe due to an instanceof check that occurs elsewhere in the logic. Will the next maintainer know they must call isNonSerializableMessageValue()
as a check before this method?
I would highly suggest either performing the check explicitly in this method, or changing the parameter from Object
to GeneratedMessageV3
, thus forcing the caller of this method to perform the cast.
In general, the type-check (instanceof
) and type-assertion (cast
) should always be colocated - as two actions are tightly coupled and will need to change together if one is modified.
paramValueList.add(toQueryParamValue(fieldValueItem)); | ||
} | ||
} | ||
if (hasProcessedMessage) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than introducing this variable and a secondary exit point for the function, consider instead just performing a !paramValueList.isEmpty()
check at the end to decide whether to add to the final fields
map or not.
also fixed best practice issues pointed out in last commit's PR
enums passed as root object to putQueryParam will now be serialized as their numeric value
9095fab
to
08b655b
Compare
Also added tests for serializing objects that contain Any typed messages. Note that the type registry must have the tested types beforehand, so they were added in the test class setup
a668820
to
471a134
Compare
expectedFields.put( | ||
"object.options.value.value", Arrays.asList("1.000000001s", "2.000000002s", "a.b.c,d.e.f")); | ||
// used by JSON parser to obtain descriptors from this type url | ||
expectedFields.put( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting edge case, I don't think we should include these type info in the query params, we probably don't want to configure an Any field as query params as well. Can you do a little more investigation on this?
// tests with Any type messages require corresponding descriptors in the type registry | ||
requestSerializer = | ||
ProtoRestSerializer.create( | ||
TypeRegistry.newBuilder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check if these descriptors are added to ProtoRestSerializer on generating client libraries? If not, we should added the well knowntypes to the registry as well.
Kudos, SonarCloud Quality Gate passed! |
The final approach was to rely on JSON serialization to produce fields to be included in the query paramenters. This allows well-known types such as Duration and Timestamp to be serialized into a string with compliant format, plus allowing complex message types to be serialized as tree/array objects that would be recursively processed and added as query parameters. As next steps, we must handle enum types as root of the input of |
Fixes #1783
Some message derived types such as Duration, FieldMask or Int32Value
would not be correctly handled by String.valueOf(). Instead, the
toJson() method is used to make it compliant with the protobuf language
guide