# Allow other plugins to modify ScalaPB types #1007
### Solution 1: Preprocessors as sub-plugins

Add to ScalaPB an option to load and invoke a class that will preprocess a …
[*] This would make ScalaPB's compiler-plugin load the code from this artifact (fetching it if necessary through coursier), and expect to find a function …
Good point. Another issue is that updating a given …
The function would be called only for files that are in the scope of the preprocessor. `DescriptorImplicits` will merge the new options at query time (like it does for other auxiliary options).
Preprocessors should be able to customize https://github.com/scalapb/protoc-bridge/blob/0214bdd17e9e3034d421bc51f38a4b7f34934478/bridge/src/main/scala/protocbridge/ProtocCodeGenerator.scala#L7 (to be able to use cats classes in generated code when …).
Since the dependency on the preprocessor is only discovered during the invocation of the ScalaPB source generator, the SBT project is already set up, so we can't modify its library dependencies. However, Solution 2 (below) doesn't have this limitation.
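A minimal sketch of the kind of sub-plugin hook Solution 1 describes. Everything here is hypothetical, not ScalaPB API: the names, the signature, and the placeholder options type standing in for the real protobuf descriptors to keep the sketch self-contained.

```java
// Hypothetical sketch of a Solution 1 preprocessor: ScalaPB would load such a
// class from the declared artifact and call it for each in-scope proto file,
// then merge the returned options at query time via DescriptorImplicits.
public class PreprocessorSketch {
    // Stand-in for the real per-file ScalaPB options (scalapb.ScalaPbOptions).
    record FileOptions(String scalaPackage) {}

    // The hypothetical function ScalaPB would expect to find in the artifact.
    interface Preprocessor {
        FileOptions preprocess(String protoFileName, FileOptions current);
    }

    public static void main(String[] args) {
        // Example preprocessor: rewrite the Scala package for files under myorg/.
        Preprocessor p = (name, opts) ->
            name.startsWith("myorg/") ? new FileOptions("myorg.scala") : opts;
        System.out.println(
            p.preprocess("myorg/foo.proto", new FileOptions("")).scalaPackage());
    }
}
```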
### Solution 2: Preprocessors as protoc plugins communicating through protocbridge

In this solution, preprocessors are protoc plugins (not ScalaPB sub-plugins) that are invoked directly by protoc (and would be listed in PB.targets in SBT). They may generate zero or more files and, in addition, augment their … The user's project would have a package-scoped option that instructs ScalaPB to look for those updated options:
In this solution preprocessors are JVM protoc plugins; they are loaded into SBT and can have …

[*] Tells ScalaPB to look for …

This solution would work when protoc-bridge is available (at least for ScalaPBC, sbt-protoc, pants, mill), but not when using protoc directly, since there's no shared protoc-bridge. I expect this to provide sufficient coverage, but in order to cover users who use protoc directly, they would be provided with the following workaround: run protoc once with the preprocessor and a parameter to emit the UpdatedOptions into a binary file. Then, run protoc again and provide the UpdatedOptions file as a generator parameter for ScalaPB. While this solution doesn't satisfy requirement (2) nicely, it's probably acceptable given the expected use. I prefer this solution over Solution 1, since there is no runtime classloading involved and SBT or other build tools would take care of loading the preprocessor using existing mechanisms.
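As a rough illustration of how Solution 2 would surface in a build, a hypothetical build.sbt fragment. The `myPreprocessor.gen()` generator is invented for this sketch; `PB.targets` and `scalapb.gen()` are the existing sbt-protoc/ScalaPB entry points.

```scala
// build.sbt (sketch): the preprocessor runs as a regular protoc plugin in the
// same protoc invocation, before ScalaPB consumes its updated options.
Compile / PB.targets := Seq(
  // hypothetical JVM protoc plugin publishing "updated options" as a side output
  myPreprocessor.gen() -> (Compile / sourceManaged).value,
  // ScalaPB then generates code, looking up the options the preprocessor emitted
  scalapb.gen()        -> (Compile / sourceManaged).value
)
```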
I guess ScalaPB would fail if it cannot find …?
Instead of storing … In any case, the in-memory approach would work for the use-case driving this, as we don't use ScalaPB as a raw protoc plugin. I was just challenging the in-memory approach because I think it is quite hard for people not familiar with the ecosystem to understand the protoc-bridge lifecycle, so it might make troubleshooting harder.
Agreed. The only tiny downside I see is that a preprocessor protoc plugin needs to be published for all supported platforms if we want it to work with protoc directly (but it looks like the GraalVM build target from ScalaPB is easily reusable)?
### Solution 3: Preprocessors as declarative transformations defined as package-level options

Thanks to new file/package-level ScalaPB options, …

```protobuf
message ScalaPbOptions {
  // ...

  // reference packages to inherit preprocessors defined as package-level options
  repeated string preprocessors = 22;

  message FieldOptionsPreprocessor {
    required string if = 1; // i.e. "validate.rules"
    oneof textformat_value {
      string contains = 2; // i.e. "{ repeated: { min_items: 1 } }"
      string is = 3;       // i.e. "{ string: { email: true } }"
      // maybe a not_contains could be useful?
    }
    optional scalapb.FieldOptions add_field_options = 4;
    optional scalapb.ScalaPbOptions add_file_options = 5; // https://github.com/scalapb/ScalaPB/issues/1007#issuecomment-747637804
  }

  // processors are evaluated in order, merging options along the way on the matching elements
  repeated FieldOptionsPreprocessor field_options_preprocessors = 23;

  // same for messages, oneofs, enums, ...
}
```

… custom transformations could be defined in a package:

```protobuf
package scalapb.validate.cats;

import "scalapb/scalapb.proto";

option (scalapb.options) = {
  scope: PACKAGE,
  field_options_preprocessors: [
    {
      if: "validate.rules",
      contains: "{ repeated: { min_items: 1 } }",
      add_field_options: { type: "cats.data.NonEmptyList" },
      add_file_options: { import: "scalapb.validate.cats.NonEmptyListMapper" }
    },
    {
      if: "validate.rules",
      contains: "{ repeated: { unique: true } }",
      add_field_options: { type: "LinkedHashSet" }
    }
  ]
};
```

… that the user should import on the package to be transformed:

```protobuf
package com.mypackage;

import "scalapb/scalapb.proto";

option (scalapb.options) = {
  scope: PACKAGE
  preprocessors: ["scalapb.validate.cats"]
};
```

This is of course less powerful than Solution 1 and Solution 2, as transformations must be declared using a simple DSL, but simpler as no codegen plugin is involved. I haven't gone through all PGV fields and whether mappings can be declarative, but I am hoping they can with simple matchers.

Implementation-wise, parsing textformat seems to be easy, but I am unsure how robust/custom the runtime option parsing (…) would need to be.
#### For Solution 2

Yes to both questions. Failing would be the way to know that the expected options are not found. Maven names as a convention sounds good. I like the suggestion to communicate through the filesystem; it's also in line with the advice I got on the protobuf mailing list. I'd probably make the paths explicitly provided, and introduce some convenience in SBT to make it easy to use. I have a bare-bones implementation of Solution 2 locally. It works well, but there's a chain of promise-future that passes the side output between plugins, which could make it non-obvious to debug.

#### For Solution 3

It's worth consideration, especially for not requiring a pre-processor. I am a bit concerned that in practice there's going to be more logic required in the preprocessor than this mini-DSL would be able to support. In terms of project execution, I am relying now on Solution 2 (available locally for me) to explore the deeper parts of the project (like non-total fields), import injection, etc. I'll probably be able to evaluate all the mini-DSL would need to do better then. I am more inclined right now toward a variant of Solution 2 that doesn't use in-memory communication but external files, since it would require little to no modification to protoc-bridge.
One concern I have, both for in-memory and file-based communication (which the answer on that mailing list does not address), is whether it's guaranteed that protoc plugin invocations are and remain sequential and in-order. Do you know about that?
Indeed, time will tell. This might be out of scope of the driving use case, but I can think of a few of our internal use cases where it would be useful to have locally-declared mappings to easily and centrally derive ScalaPB options from certain custom options: a way to keep the protos annotated with only semantic custom options without leaking ScalaPB sugar, yet leveraging it. A bit like auxiliary options, but without the ad-hoc part.
Yes, see https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/compiler/plugin.proto#L168-L169 - however, their output is written to disk by protoc only at the very end. If we land on a solution where plugins communicate through the filesystem, the writing (and reading) would be through a side-effect in the plugin.
Apart from cats, we might in the future implement PGV mappings for refined. In that case, it would be very interesting to capture values for PGV options (and not just trigger on their presence/usage), to use them as literal types. That's not trivial with a protobuf-based DSL.
protoc-bridge will now create a directory for secondary outputs which plugins can read and write to in order to exchange additional information through a single protoc run. The location of the secondary directory is provided to native protoc plugins through an environment variable SCALAPB_SECONDARY_OUTPUT_DIR passed through protoc. Since JVM based plugins run in the same JVM process as protocbridge, they are not able to see this environment variable. For them, a new message `ExtraEnv` is passed as an unknown option using the ScalaPB field number (1020). See scalapb/ScalaPB#1007
### Solution 4 (currently implemented in preview1)

protoc-bridge creates a temporary directory before running protoc that plugins can read from and write to. protocbridge does not track access to the directory. The expectation is that plugins create files that match their class name. ScalaPB gets a new …

How do plugins know where the secondary output directory is? protocbridge passes it to protoc through an environment variable, `SCALAPB_SECONDARY_OUTPUT_DIR`.
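A sketch of the filesystem handshake in Solution 4. The plugin class name and payload are invented; the directory discovery via `SCALAPB_SECONDARY_OUTPUT_DIR` and the file-per-class-name convention are as described above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SecondaryOutputSketch {
    // A preprocessor writes its side output into the secondary output
    // directory, using its class name as the file name (the convention above).
    static Path write(Path secondaryDir, String pluginClassName, byte[] payload)
            throws IOException {
        return Files.write(secondaryDir.resolve(pluginClassName), payload);
    }

    // ScalaPB (or any later plugin in the same protoc run) reads it back by name.
    static byte[] read(Path secondaryDir, String pluginClassName) throws IOException {
        return Files.readAllBytes(secondaryDir.resolve(pluginClassName));
    }

    public static void main(String[] args) throws IOException {
        // Native plugins would discover the directory via the environment:
        //   String dir = System.getenv("SCALAPB_SECONDARY_OUTPUT_DIR");
        // This sketch uses a temporary directory instead.
        Path dir = Files.createTempDirectory("secondary-outputs");
        write(dir, "com.example.MyPreprocessor", "updated-options".getBytes());
        System.out.println(new String(read(dir, "com.example.MyPreprocessor")));
    }
}
```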
@bjaglin, the preprocessor in scalapb-validate has a mini-DSL inspired by the one you proposed in Solution 3.
scalapb/scalapb-validate#38 (comment) is very exciting as it really bridges PGV to idiomatic, state-of-the-art Scala! As already mentioned above, I wonder if this DSL couldn't/shouldn't be part of ScalaPB itself. Could https://github.com/scalapb/scalapb-validate/blob/4b49c274f5ab7bc7a67ab8db38499bb9c9fdfb26/core/src/main/protobuf/scalapb/validate.proto#L45 be made more generic while keeping type safety by using …?

To give a concrete use-case (distinct from scalapb/scalapb-validate#38), we are setting a …

This is just a final thought on this issue, not a blocker to close it (nor scalapb/scalapb-validate#38).
I looked into a solution where the DSL is part of ScalaPB while investigating this issue. I quickly ran into some technical challenges since the types we match on (pgv) are unknown to ScalaPB and only loaded at runtime through what the user imports. It turns out that working with unknown/dynamic extension fields is a bit tricky and I wasn't able to find a way to get the syntax and type safety we are getting with the current implementation. However, now that we have #38 figured out, I'll be revisiting this idea to see if we can embed field transformations on arbitrary options in ScalaPB.
Good news - I believe I am close to getting the FieldTransformation DSL implemented in ScalaPB rather than in scalapb-validate. I was able to find the necessary tricks to work with unknown extensions. The benefit, besides simplification of the data flow, is that field transformations would be able to match on arbitrary custom options. If we go there, I will remove the field transformations from scalapb-validate so we have a single implementation. scalapb-validate will retain the capability to push cats and set field transformations through its secondary output. I'll check in the coming days if we can have it within the preview series, so we have a finalized API when we merge.
Following up here on the now-generic field transformation described in scalapb/scalapb-validate#38 (comment), as it's not blocking scalapb/scalapb-validate#38 but further usage of that supporting feature. A limitation/gotcha in the current implementation/design is that field transformations injected by preprocessors can only work if the field extensions used in the rules can be resolved (see ScalaPB/compiler-plugin/src/main/scala/scalapb/compiler/FieldTransformations.scala, line 156 at 76642ff).
Add compiler-plugin tests for field transformation injections For #1007
Commit d443601 improves the error messages and adds unit tests in the compiler plugin for the scenario when injected transformations can't be resolved. I made a change in scalapb-validate to make this scenario impossible, as explained in scalapb/scalapb-validate#38 (comment).
I ran into what could be considered a limitation with the current DSL design (I am really giving it a hard time by pushing as much as I can ahead of merging the preview, beyond the primary goal of this ticket!). It's impossible to set non-ScalaPB options through `set` (see ScalaPB/protobuf/scalapb/scalapb.proto, line 323 at 4003e62).
In my case, I was interested in setting PGV options using field transformations (based on custom options in the …).
@bjaglin, interesting. If we do this there's a caveat that could be quite confusing: the original PGV generator wouldn't be able to see the transformed options. Not sure if it's a reason not to do it though.
I just realized ScalaPB/protobuf/scalapb/scalapb.proto, line 120 at 4003e62, is where `set` ends up, no?
Yes, just thought of the same. This would require some amount of rework of the internal plumbing field transformations rely on. In order to let #38 reach completion with the suggested interface so we can build this later on, I intend to change `set` (ScalaPB/protobuf/scalapb/scalapb.proto, line 323 at 4003e62) to `google.protobuf.FieldOptions`, and make the implementation verify that only ScalaPB options are set.
That would be yet another API change, but I am wondering if the …
The use case makes sense. I'll look into getting it in. I wonder if the levels of nesting needed to match on a PGV field (for example, …)
I initially thought of suggesting to add another "targeting" field specifically for type and to leave …
Now `field_transformations.when` matches on `FieldDescriptor` and `field_transformation.set` accepts `FieldOptions`; however, it only allows setting ScalaPB options for the time being. For #1007
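As an illustration, a sketch of the generalized shape that commit describes in a proto file. The PGV matcher and the `collection_type` value are illustrative assumptions, not a verbatim ScalaPB example.

```protobuf
option (scalapb.options) = {
  scope: PACKAGE
  field_transformations: [
    {
      // `when` matches on the field descriptor, so arbitrary custom options
      // (here, a hypothetical PGV rule) can be targeted.
      when: { options { [validate.rules] { repeated: { unique: true } } } }
      // `set` carries google.protobuf.FieldOptions; for the time being only
      // the ScalaPB extension within it is honored.
      set: { [scalapb.field] { collection_type: "scala.collection.immutable.ListSet" } }
    }
  ]
};
```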
Some more feedback/thoughts on the API as I continue using the transformation DSL: …
We anticipate it will be shared with other transformation types. See #1007
I moved …
Note that the …
You are right, I already thought of a potential use-case for an …
As transformations are a ScalaPB feature not constrained to validation, their documentation has been factored out to a self-contained page. For #1007
Closing this issue as the general approach for transformations has been established, and implemented specifically for fields. Transformations for other entities will be worked on through future tickets.
To implement scalapb/scalapb-validate#38, we want to enable other libraries or plugins to modify the types that ScalaPB will generate. Since protoc plugins have very limited ways to consume what a previous plugin produces (only insertion points are currently supported), we need to come up with our own mechanism to plug external type information into ScalaPB.
Requirements:

- `DescriptorImplicits` should be able to see the updated types, so they generate valid code.