Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

grpc-netty-shaded cannot be successfully compiled under GraalVM #10601

Closed
zenghu1chen opened this issue Oct 9, 2023 · 21 comments
Closed

grpc-netty-shaded cannot be successfully compiled under GraalVM #10601

zenghu1chen opened this issue Oct 9, 2023 · 21 comments

Comments

@zenghu1chen
Copy link

zenghu1chen commented Oct 9, 2023

What version of gRPC-Java are you using?

grpc-netty-shaded-1.58.0

What is your environment?

  • operating system: Both Ubuntu 11.3.0 and Windows 10
  • jdk: GraalVM CE 22.3.3 (build 17.0.8+7-jvmci-22.3-b22)

What did you expect to see?

successfully compiled under GraalVM

What did you see instead?

compiled failed under GraalVM

Steps to reproduce the bug

  1. clone the repo https://github.com/zenghu1chen/graalvm-grpc and execute cd graalvm-grpc
  2. execute mvn clean -Pnative native:compile

My project has been recently trying to use GraalVM. Due to the use of the grpc-netty-shaded package, the compilation has been unable to pass. I only recently found out about this issue.
My current temporary solution is to delete the META-INF/native-image folder. It seems to work fine, but I'm not sure if it's appropriate.

@ejona86
Copy link
Member

ejona86 commented Oct 9, 2023

You don't mention what error you get, and I haven't had a moment yet to try the reproduction, but this sounds likely to be related to this rewriting logic.

class NettyResourceTransformer implements Transformer {

It is interesting that it works better without that rewriting than with. We used to throw away the netty graalvm config.

@zenghu1chen
Copy link
Author

Here is the error I got

Error: Classes that should be initialized at run time got initialized during image building:
 org.slf4j.LoggerFactory was unintentionally initialized at build time. To see why org.slf4j.LoggerFactory got initialized use --trace-class-initialization=org.slf4j.LoggerFactory
To see how the classes got initialized, use --trace-class-initialization=org.slf4j.LoggerFactory
Error: Use -H:+ReportExceptionStackTraces to print stacktrace of underlying exception

This seems to indicate the introduction of inappropriate build-time initialization configurations, such as --initialize-at-run-time=.

@sanjaypujare
Copy link
Contributor

...
This seems to indicate the introduction of inappropriate build-time initialization configurations, such as --initialize-at-run-time=.

So this is in the build config/script you have created for GraalVM, right? Is there anything to be done/fixed in the grpc java repo?

@zenghu1chen
Copy link
Author

--initialize-at-run-time= and other configurations are not created by me, but exist in the native-image.properties configuration file under the META-INF/native-image folder in the grpc-netty-shaded package.
Currently, I think it is possible to consider deleting the META-INF/native-image folder in the code, at least this way it can be compiled and run successfully.

@Override
boolean canTransformResource(FileTreeElement fileTreeElement) {
fileTreeElement.name.startsWith("META-INF/native-image/io.netty")
}
@Override
void transform(TransformerContext context) {
String updatedPath = context.path.replace("io.netty", "io.grpc.netty.shaded.io.netty")
String updatedContent = context.is.getText().replace("io.netty", "io.grpc.netty.shaded.io.netty")
resources.put(updatedPath, updatedContent)
}

@sanjaypujare
Copy link
Contributor

In terms of changes to the NettyResourceTransformer class in build.gradle are you suggesting not changing anything under "META-INF/native-image/" ?

@zenghu1chen
Copy link
Author

My suggestion is to delete the META-INF/native-image/ folder. Specifically, the original code is

@Override
void transform(TransformerContext context) {
String updatedPath = context.path.replace("io.netty", "io.grpc.netty.shaded.io.netty")
String updatedContent = context.is.getText().replace("io.netty", "io.grpc.netty.shaded.io.netty")
resources.put(updatedPath, updatedContent)
}

the modified code is

    @Override
    void transform(TransformerContext context) {
        // not put any resource for these folder
    }

@sanjaypujare
Copy link
Contributor

I am not sure that's the right change. In any case feel free to send a PR and that can be reviewed.

@zenghu1chen
Copy link
Author

Here is the PR #10607

@larry-safran larry-safran added this to the Next milestone Nov 8, 2023
@VjekoslavKrainovic
Copy link

Is there any update on this?
I am using Google Firebase-Admin library which has GRPC library in pom.xml

This is error i am getting:

[INFO] [creator] Jan 07, 2024 9:07:06 PM com.google.api.gax.nativeimage.NativeImageUtils registerClassForReflection [INFO] [creator] WARNING: Failed to find io.grpc.netty.shaded.io.netty.channel.ProtocolNegotiators on the classpath for reflection.

@zenghu1chen
Copy link
Author

There hasn't been any progress for now, and I haven't had time recently to study the actual principles.

For this situation, you can try adding io.grpc.netty.shaded.io.netty.channel.ProtocolNegotiators to the reflection configuration file reflect-config.json.You can also open an issue to Firebase-Admin.

@abuijze
Copy link

abuijze commented Jan 11, 2024

I've bumped into this issue recently and managed to find what's causing the problems.

The problem
The compile-time problems are caused by the native-image.properties files that are copied (and transformed) from Netty. Specifically these files for the netty-codec-http and netty-codec-http2 libraries contain an instruction to initialize classes in the io.netty package (which is transformed to io.grpc.netty.shaded.io.netty) at build-time rather than runtime.

Additionally, the GraalVM reachability metadata repository contains entries for netty-common and netty-transport, which are not taken into consideration when using grpc-netty-shaded. The entries in the GraalVM metadata repository overlap with the entries in the modules themselves, but don't fully match.

The workaround
I've managed to get things fully working by performing the following steps:

  • manually modify the grpc-netty-shaded-1.60.1.jar and removing all references to --initialize-at-build-time=io.netty. They exist in 2 places: the transformed files from netty-codec-http and netty-codec-http2.
  • Take the native-image reachability metadata from the GraalVM Reachability Metadata Repository for netty-transport and netty-common and copy those into the META-INF/native-image of my project.

The solution
The --initialize-at-build-time=io.netty entries in netty-codec-http and netty-codec-http2 are most likely there for a reason. However, the jars only contain classes in the io.netty.codec.http and 'io.netty.codec.http2' packages, respectively. So instead of removing the entries entirely, they could be rewritten to --initialize-at-build-time=io.grpc.netty.shaded.io.netty.codec.http and similar for http2.

The reachability metadata entries in the GraalVM Reachability Metadata repository could be included and transformed for the shaded jar.

Or alternatively, open a PR to include similar entries for the grpc-netty-shaded jar in the Metadata Repository.

@ejona86
Copy link
Member

ejona86 commented Jan 11, 2024

Recent related: googleapis/google-cloud-java#10180

@VjekoslavKrainovic, your problem sounds different. 1) That looks to be a warning, so might could be ignored. 2) That looks to be a gax problem; that class does not exist.

@abuijze, your findings look really promising.

The --initialize-at-build-time=io.netty...

Sounds like that should be an upstream Netty change. But yeah, that does seem it should be io.netty.codec.http and similar.

The reachability metadata entries...

I don't think it makes sense to copy them to grpc-netty-shaded from that repository. But it could make sense to upstream them to netty.

That metadata repository scares me some, because why do things like this netty-buffer config exist? Because override=true, that just disables Netty's configuration?

@julienvincent
Copy link

I'm also running into this, very interested in a solution.

@abuijze Have you managed to automate this somehow? Specifically the first step:

  • manually modify the grpc-netty-shaded-1.60.1.jar and removing all references to ...

I am wondering if this is possible to incorporate into a build system.

And when copying the reachability metadata did you need to modify any of the package namespaces to include .shaded.?

@abuijze
Copy link

abuijze commented Jan 13, 2024

I've done done some more digging to find out why exactly things were working previously and not now. There's a few dots where I have to guess the connection, but otherwise I've managed to connect most of the dots now.

Sounds like that should be an upstream Netty change

@ejona86 , I agree with you. I don't fully understand why it worked in separate jars and not the shaded jar. It seems that the configuration only affects classes in that specific jar. On the other hand, that doesn't seem to make sense, as the property provided is a build property and should affect the entire build. But after all, yes, netty should change the properties to potential conflicting configuration. I'll look into an issue there and maybe a PR. Should be fairly simple.

I don't think it makes sense to copy them to grpc-netty-shaded from that repository

Ideally, that shouldn't be necessary. In fact, if you read the prerequisites of filing a PR in the GraalVM Metadata repository, they explicitly mention that projects should attempt to include the right reachability metadata themselves first. What really confuses me is that the netty modules contain reachability metadata, but apparently it's not sufficient. I wonder if that has to do with reachability requirements that netty doesn't maybe have itself, but which just other libraries rely on.

@julienvincent , I haven't come to automating this yet. I did it manually at first, just to find out what works and what not. I made it half a study project too, to dive a bit further in the inner workings of native compilation.

I have managed to get things working entirely with the netty-shaded jar (I used version 1.60.1). I put the reachibility metadata classes on the classpath, including a native-image.properties file that override the erroneous settings in the netty modules. Here is the gist with the files I ended up with

And when copying the reachability metadata did you need to modify any of the package namespaces to include .shaded.?

Yes, that was the main reason I had to copy the files. The ones in the repo provide metadata for the original classes, but the shaded classes contain a package prefix. So these shaded classes would need the same reachability meta data to get things working.

@abuijze
Copy link

abuijze commented Jan 15, 2024

I've found more time to understand why things were happening the way they did and the difference in behavior with regular Netty dependencies versus the shaded dependency.

My demo app is a Spring Boot app that contains both regular Netty (used by Spring Boot) as well as grpc-netty-shaded for the Axon Server Connector. When running the build, I noticed that the Spring Boot build triggers a native build with a lot of --exclude-config parameters. It excluded all the Netty native configurations and replaced them with their configuration.

I've compared their configuration with the one I shared in the gist in the previous comment. The difference was interesting: only the package prefixes are different. It seems that they replace the internal Netty configuration with the one published in the GraalVM Reachability Meta Data Repository.

All in all, I think @zenghu1chen 's comment might very well be the right way to go. Having no native configuration is better than having a faulty one, as excluding configuration from modules is not an easy thing to do.

At Netty, they are reluctant to change their configuration, as they are afraid of breaking compatibility with components that may have things working. They have scheduled the fix for Netty 5.

@julienvincent
Copy link

julienvincent commented Jan 15, 2024

@abuijze Thanks for providing the gist, I was able to successfully get it compiling with that config. I am running into some further issues when trying to run the compiled binary though:

java.lang.NoSuchMethodException: io.grpc.netty.shaded.io.netty.channel.socket.nio.NioSocketChannel.<init>()

Have you run into this as well?

I guess this might just require adding additional reachability metadata but I wonder why this would be required if using the (altered) official reachability metadata as source.

@abuijze
Copy link

abuijze commented Jan 15, 2024

No, I didn't run into that one. I can see several entries for that constructor in the reflect_config.json file. Perhaps none of the "conditions" match in your case?

@ejona86
Copy link
Member

ejona86 commented Jan 16, 2024

At Netty, they are reluctant to change their configuration, as they are afraid of breaking compatibility with components that may have things working.

@abuijze, can you link to that discussion?

@lohhonglee
Copy link

@julienvincent, I have created a new GraalVM example example-graalvm (in my1.61.x branch) with automated removal of problematic native configuration in grpc-netty-shaded. See if that works for you. It is currently done for Gradle build only. If someone can help complete the Maven and Bazel builds, it can potentially be a PR.

What I have done is created a custom Gradle plugin that

  1. Downloads the grpc-netty-shaded JAR
  2. Modifies the JAR by removing META-INF/native-image folder
  3. Creates a local Maven repository that contains the modified JAR

And the build script will

  1. Make the build search for grpc-netty-shaded JAR in that repository
  2. Provide reachability metadata for grpc-netty-shaded (reusing abuijze's gist) as configuration files to the compilation.

(Ideally, the retrieval of configuration files should be automated to fetch from GraalVM Reachability Metadata repository)

@abuijze, I would also like to know the link to the discussion at Netty regarding their reluctance to fix it in Netty 4.

@abuijze
Copy link

abuijze commented Jan 25, 2024

Had to do some digging in my browser history, but I found the discussion: netty/netty#12601

@ejona86
Copy link
Member

ejona86 commented Mar 20, 2024

We don't have the expertise for this, and we are just caught in the middle. If a recommendation came out from googleapis/google-cloud-java#10180 we'd be interested in doing our part. But there doesn't seem much for us to do. It sounds also like there are missing configs in netty that would improve the situation while still allowing ahead-of-time class initialization. I agree it is pretty nasty that Netty's configs are broken, such that no config works better, but that's really beyond us. If Netty rejects it, then Netty config override should be copied in GraalVM Reachability Metadata to grpc-netty-shaded. That is a much more reliable change that us having a third configuration in grpc.

Although Netty rejected the previous change the config, I think that's because the change was too aggressive. A change that preserves ahead-of-time initialization would probably be accepted more readily.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
8 participants