-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Access to default methods in @JImplements
#1182
Conversation
Will be hard to get this to work as the jar file must be loaded into the base classloader. It may be doable, but will require serious effort. |
sorry I cannot estimate if it'd be worth the effort? How common are these "default methods"? |
Default methods are a newer feature of Java in which the programmers adds some functionality to an interface which will automatically be available to the implemented classes. They are common in newer code, but most of the Java library classes currently don't use them. They are difficult to access because they didn't exist when JNI was written so there is no "call the default" method and the proxy system works against them. The issue here is that the newer java security methods prevent calling code from arbitrary modules, and our jar is not located in a position of any privilege as we are not in the boot loader. About the only way I can make this happen would be to make my own Jar loader which uses the JNI special under the table class definition method which in theory would get the JPype into a lower level. However, JNI doesn't really have any support for modules or module definition so I have no idea if I can actually get to the level required to make this type of access. Making a jar side loader would require access to gzip and tar access. There may be severe technical issues if I need to load the class files in a particular order. One can expect that many libraries will have default method if they use interfaces at some point. Thus there is some value, but it is a large lift (and given I am underwater until July it won't happen soon.) |
Bypassing JVMs security mechanisms does not seem right. If these extended Interfaces do not posses a mechanism in JNI, we should request them upstream. It does not make any sense for them to be inacessible, right? Why are default methods not already handled by the MRO, e.g. calling the parents implementation? Actually it should, should it not? |
The problem is more complex than that. The issue you cited is regarding calling dynamics from JNI without the presence of a proxy.
See this posting which is more relevant…
https://stackoverflow.com/questions/37812393/how-to-explicitly-invoke-default-method-from-a-dynamic-proxy
We do see default methods in the MRO and we can try to invoke them. The issue occurs because our methods are a proxy. All methods called on a proxy, the proxy then redirects to Python or back to Java. But because of the way that defaults are implemented that just goes back to the proxy again. For normal interface method this isn’t a problem as there is nothing to call back to so we just throw UnsupportedOperation exception and the guard code prevents the user from doing this when a JProxy was declared. We can’t redirect it back to Java through the JNI because defaults don’t have an invoke method other than the one that routes to the proxy. Java can redirect to the method, but because of the new security model you must be in the same module (or be a privileged class) to access the MethodHandle. Thus the catch 22. There is no security violation to call a default method (it is public) but there is no way to access the method outside of using a secure path.
This means the only way to implement this feature is to place org.jpype in a privileged state such that it can call a public method (and be trusted to check that). This is a typical problem with the patching in of new features without considering the native interface. We faced the same issue when they started checking the module called from, without considering that JNI calls have no module. The correct solution would be to have JNI register itself to have a module presence that use that module to determine the access.
Solutions….
1. Use the JNI loadClass interface to load our jar into the bootstrap loader such that we live at the same level as core libraries. Of course this has the downside that calls through our jar may be able to access private methods introducing security flaws (through the total number of paths is small.).
2. Use the JNI loadClass interface to load only the Proxy code required to make the request. If the call checks for public access and is bullet proof, then it is better than placing our whole jar file at a privileged level.
3. Try to convert our entire infrastructure to a module and then set the module privilege flags when calling the start JVM to elevate our module. Same issue as 1.
4. Abandon old versions of the JDK…. (Please pick me! Pick me!) These problems all stem from poorly thought through patches. They eventually got reported and thus added support methods that solve the issue. The problem is those support methods don’t exist and thus I have to go through very large hoops to make us work in only versions. In any sane world JDK 1.8 would long since have hit the dust bin and only the latest LTS would need to be dealt with. As each is retired we could remove the kludges.
All of these are a lift because it means we can’t insert jar files through the typical DynamicClassLoader (which places us at a lower that typical security level) but instead must use an entirely different loading scheme.
I should note that this patch works perfectly if we simply place the org.jpype.jar on the classpath of jpype.startJVM(). The difference is that when we enter at the classpath we get loaded by the System loader rather than by the Dynamic loader. To have truly elevated privileges we have to load our classes in the bootloader.
|
I am still trying to come up with a solution for this. The Java9 security model is the source of the issues. Certain actions require that the code be loaded into the lowest layer of the JVM and be contained in an authorized module. There was no thought as to how the JNI layer interacts as it doesn't have a module nor is formally loaded into the ClassLoader other than by class which invoked it first. A lot of stuff we have yet to be able to support such as calling default methods, calling protected methods/accessing protected members when extending a class are all stuck on this same common element. Without appropriate privileges all calls end up as second class and are denied. I have tried a number of different ways to work around this:
I did some testing on the javaagent option. As far as I can tell noone outside of my group makes use of agents in the Looking deeper I think that agent may solve the problem though in a weird way. You can start multiple agents with the repeated arguments so it isn't like java.class.path. Second it appears that if we simply make org.jpype.jar an agent (even if it doesn't do anything) it will automatically be promoted to a level were we can access reflection freely. The problem with the JVM classpath security model is that assumes that everything must be known at start of JVM and thus the only jars that can be trusted exist at that time. Many jar types that provide services just don't work after the JVM is started. I have not found any API which allows the JNI controlling process which spawned the JVM to control the access privileges. The entire model of the JVM is "one and done" for any and all settings. I am guessing that is either laziness on the part of JVM developers (this is edge case stuff and we have better things to do), neglect (which JNI is clearly suffering from), or some lame attempt at security from back when the JVM was still running in browsers and people were sandbox breaking. Regardless of the case, it gives us the classic "sucks to be you" feeling as seeming reasonable requests just can't be met. It makes me long for that far future in which graalvm or web assembly VM simply provides a language independent platform which Python and Java can play nicely. |
@marscher I think this one may be working now. But we need other PR to get test matrix back on line first. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this PR is a lot to chew on, I have highlighted the critical sections.
return MethodHandles.lookup() | ||
.unreflectSpecial(method, method.getDeclaringClass()) | ||
.bindTo(proxy) | ||
.invokeWithArguments(args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the heart of the change as we need to call these privileged methods to invoke a special method. Of course given the method we are trying to invoke is nothing more than a public method implemented in the interface there is absolutely nothing privileged about the call. The issue is simply that the API through which we are accessing a public method can also be used to invoke private methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't using MethodHandles.publicLookup be better here since it can only access public fields/methods thus removing the privileged issue?
I'm sure there are some possible benefits to having a java agent, I just wanted to point this out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can certainly try that. Though we will still need the agent if we want to make sure that it never changes behavior based on how JPype is started. (Something that plagues the development.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other reason for requiring privilege is if we ever want to be able to "extend" a class in Python. In that case we will need to get to protected methods, and protected fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other reason for requiring privilege is if we ever want to be able to "extend" a class in Python. In that case we will need to get to protected methods, and protected fields.
I actually have a prototype using JDK22 ClassFile API preview feature, as an independent python package, where I managed this. I designed it around the impression that if you want to extend a Java class from Python, then you don't care about the added overhead 😅.
I basically created delegate methods that start with "-" which could only be accessed by reflection or it would be a syntax error. It's gross but it worked.
I was considering integrating and contributing it since it would not introduce GPL code. Was just waiting on it to exit preview in a LTS JDK and then for the free time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to call a default method from jni using CallNonvirtualObjectMethodA
if you use the interface with the default method as the claz parameter.
I figured this out while working on super()
support for extension classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CallNonvirtualObjectMethodA does not work. I ran tests on it. While it does allow the default method to be called, it ALWAYS calls the default even when overwritten. I thus went with this approach as there are no methods in the JNI interface that gave proper behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CallNonvirtualObjectMethodA does not work. I ran tests on it. While it does allow the default method to be called, it ALWAYS calls the default even when overwritten. I thus went with this approach as there are no methods in the JNI interface that gave proper behavior.
Yes that is what the NonVirtual does, it's invoke exactly the specified method. Thinking about it, I'm not sure why the regular CallVirtual doesn't work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason it doesn't is that for interfaces "ALL" virtual method route to the proxy dispatch. As default methods are virtual that means they effectively are not reachable other than by the "NonVirtual" method. We could in principle your the virtual call request to our method, then call to Python with a reference handle find out that it was really supposed to go to default, then reroute back in the JNI method for a NonVirtual call. That would mean unpacking and repacking arguments at C level and keeping track of the default level in C. Of course we would have to know to try the nonvirtual path (trying to call it when it doesn't exist is a segfault). So that means everything needs to be handed to the the C method. I attempted that path but found it distasteful as it required making reflection API calls prior to launching to C.
It is "doable" but means more routing on the primary path to check for and passing the default method down to C. That would make this already slow path potentially even slower for a feature that not many people will likely use. Better to reroute back to Java and try to invoke the handle there.
I just wanted to put this option out here InvocationHandler.invokeDefault was introduced in JDK 16. You could access the method through reflection if using JDK 16 or newer and either throw an exception otherwise, or investigate how it's achieved in the Proxy class and see if it's manageable to do in JDK 8. If I remember correctly you can open a module at runtime, assuming you don't just slap on the --add-opens when starting the JVM (I've not looked at how jpype does this). That should get you access to whatever Java internals you're trying to get at, although I don't think it's the best solution. I understand that there needs to be support for JDK 8, but if you're using a 10 year old version of the JDK, that's on you. Things for newer JDK versions can be accessed via reflection with a negligible amount of additional overhead. If a user wants to use these features but can't, then maybe they should stop using |
Regarding illegal reflective access. For the LTS JDKs it works as follows:
So for this specific instance, you can use the JDK16+ method when available and use illegal reflective access otherwise. There is a table with access information here. I couldn't figure out how to link to the section but it's a little more then halfway down the page. |
Good to look into. I put in a patch that made JPype load as an Agent. That may allow us to get certain accesses that may otherwise be prohibited. The current issue with JPype is that we are side loading in a class loader post JVM startup. As an agent we can in theory change the byte code of anything started including permissions.
|
What accesses are you trying to make that would be prohibited? I just finished looking at the changes and didn't see any access that would be denied. I'm assuming there is no intent to support anything using the flawed My initial concern with the java agent was that maybe I wouldn't be able to attach a profiler or debugger if I wanted. However, a quick Google search has shown me that you can specify multiple |
Currently the Proxy code can’t access methods that were marked as default in interfaces. The only way to get those is to reflect with a protected call even through they are public methods. There are other restrictions when using a class loader other than the system loader. Basically JPype is being treated as untrusted internet downloaded code because we are using a modified URL loader. In such an untrusted position we only have full access to classes that were loaded after the JVM is started. Those on the classpath have higher privileges that us (unless you specifically place org.jpype.jar on the classpath.) If we become an agent we are loaded under the JVM which ensures that we have at least the same access as other jars. We have slightly better than that as we get to inspect any incoming Java class that is loaded. We can in theory get a layer lower if we use the JNI load class calls to add some of our classes to the boot loader. But that places a lot of restrictions on what we can hold.
Graph of privilege (higher is more access)
------------------------------------------------------------
Bootloader - can touch ANY Java loaded classes in any way (so long as you don’t request changes to memory layout)
====================
(Agent) - can inspect incoming classes and change permissions. Can change the memory layout of any class loaded after it is started.
System Loader – Can access anything allowed as permitted by module/access permissions <= JPype with a specific classpath listed
====================
URL Loader - Can access public methods from previous loaded classes (no special functions even if public.) <= JPype started without an explicit classpath listing
Dynamic Loader – Anything loaded by JPype after JVM is started lives in a ClassLoader that JPype created. Has the union of all the restrictions imposed by previous layers.
------------------------------------------------------------
JPype is currently not a module so it can’t be added to the access permissions of modules other than by giving access to all.
The access model has shifted at Java 8, 9, and 17 so some operations fail.
The usual issue that I run into is that I develop at System Loader level as I want to speed development, but most users use the URL level path. That means that I have to work around or just abandon features that fail for URL loaders.
You are correct. Unlike java.class.path specification, you can use as many agents as you want. That solves the issue of conflicts.
|
My solution was to call reflection and have it generate a stub class using ASM and interface using ASM. JPype implements those methods it wants on the interface portion and stub class forwards as needed. The hard part being that the interface needs to be able to also get ahold of the protected methods which would all be “_” leading members. We would then have to guard to make sure regular classes can’t access the underbar.
One way to trick it would be to change “self” at the edge condition so that we had a public and private version of the wrapper class. When we call overridden methods we hand them the private one. I am just afraid the private ones would naturally get leaked if something returns self, or places itself on a list.
The ASM pieces are very easy. It is proofing the rest.
|
For some reason I though ASM was GPL. Anyway just disregard that because it's not really important. I had chosen to make things accessible through I had used |
We do have one advantage in checking access. A call from within Java (which is the only way that it would be able to trigger as the borrowed methods all get processed so they don’t block the direct call path) should be able to trigger the check that we have access. In other words if you try to access it through getattr or other indirect it will say not allowed as java frame is not in scope, but calling it in the derived will pass.
|
@marscher I gave a shot at fixing collision between the two PR. Unfortunately, the bug regarding non-ASCII paths is a JVM problem and not really our issue. I find traffic on the Java forums going back in time and no good solutions. The solution in prior PR will break a number of behaviors because for many cases you absolutely MUST set the classpath when starting the JVM. Database services and such require it. I gave 3 hours trying to figure out what the encoding had done to the string from the os on CreateJavaVM which was causing an exception, but it is in a horrible spot for debugging. Barring recompiling the whole JVM with instrumentation I don't see how I can address it. The issue is that we properly encoded the vm arguments as UTF-8 when we pass it to Create_JavaVM, but the sun implementation of the URL encoder chokes causing a fail. My best solution was to pass ascii encoded arguments directly and then defer the others to late loading. But this just masks the problem as non-ascii paths are now second citizens and will work for some things and fail badly for others. Unfortunately none of this was the point of this PR, and my time is still very limited so this is the best I can do. Will this need additional review or can I merge it at this point and start our release cycle? |
@marscher I don't understand the error in the build process and azure is not letting me in to view. I will have to cycle back to this next week as I have exhausted the available time. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1182 +/- ##
==========================================
- Coverage 87.20% 87.15% -0.06%
==========================================
Files 113 113
Lines 10296 10267 -29
Branches 4076 4045 -31
==========================================
- Hits 8979 8948 -31
- Misses 722 726 +4
+ Partials 595 593 -2 ☔ View full report in Codecov by Sentry. |
Finally the tests are passing except for Python 3.13. |
@Thrameos Im sorry, that this mess eat up your time. The tests for Py313 are failing because the github token needed to download the Python version is not valid (anymore). I will give it a shot in another PR. So far we can merge this and start the release cycle. Regarding the utf8 paths and jvm implementation (bugs) we should maybe just give a warning, that a path could not be considered and use the old behaviour instead of creating hard to find bugs due to second citizen treatment. What do you think? |
I dont think that the utf8 one is actually solvable. The original contibutor tried and ended up trying to load everything after startup. I tried and failed. I tested by setting another variable and then looking at resulting string type. The answer is that we are properly setting the encoding and there is nothing wrong with our string format. The problem is that during class loading it uses a defective url code that doesn't accept nonascii strings. Thus the only solution that the jvm supports is what i implemented here. (Ascii goes at load time, non ascii gets late load). The only alternative is to completely replace the system class loader with our own which dodges the bad url parsing code. (But if then we have to watch is the user hasn't already done the same.) That is more work than I have bandwidth for. So I recommend the bandaid solution for thus version (2nd class is better than no class) then try a major fix next year to close the gap. |
I can confirm I've already done the same. Ghidra sets the system class loader and uses the fact that it knows the system class loader class extensively (it checks during startup). |
@astrelsky would it make sense for you to try to make a PR what fixes the ascii issue with a system class loader replacement and test that we only do so when the user hasn't already done so? (Ie Ghidra) The issue here being I am already overdue for a release. I am not sure if this is a big or small ask. We have a working DynamicClassLoader.. |
Sure can. I'll have a look tonight when I get off work. Just need to know what needs to be done in the system class loader that fixes the issue. I know ghidra has also had problems when users place it in a non ASCII path. I'm assuming no matter what the system class loader would need to be in an ASCII path? (not really a concern for jpype) |
Our Dynamic class loader does not have that restriction. So a URL class loader with a proper path to sources will work. The problem is going to be bootstrapping; Our dynamic class loader works because it is using a full version of some classes (URI/URL) that only exist after the system class loader has loaded them. The one that sun used for initial loading is defective which is the source of the problem. That may mean that we have to implement some support class that is both ASCII and non-ASCII aware to get the process in motion. The secondary issue is that our class loader currently assumes it is a secondary class loader. Thus it wants the real system classloader as the fallback. For a system class loader the bootloader is the backup. Of course if the user has given their own classloader we need to fall back to the current position. There may be other requirements for a system class loader that I haven't researched. Unfortunately I have never had the need. As far as testing I believe that we have everything we need. We test both ASCII and non-ASCII. Thus the only test we are missing is user specified system class loaders (making sure we don't override the users wish if they have replaced it.) |
Ok well this problem is infuriating considering it might be a 1 line fix in the jdk by using I haven't done anything more then attempt to understand the problem yet, which I'm pretty sure I do now. Unfortunately, I can think of a way to break it in every single case even if jpype has it's own system classloader that works. If someone sets up a virtualenv with a bad path, the jpype jar will be in a bad path and it won't load. I thought that maybe it could be copied to Oh and yes, this bug is still present in jdk23... |
Yes, sadly this has to happen before our agent is loaded. If it was after the agent was loaded we can monkey patch the asm to change the loader to go to a different encoder. I understand how infuriating this is. I found issue running back for 15 years on the Java support pages (Sun and Oracle) meaning it would have been a small matter for someone to have fixed it. This really shouldn't be our issue. Barring hijacking the bootloader to replace the faulty code, we are mostly stuck with horrible work arounds. |
Implementing a tiny system class loader to be used to work around it should be trivial. I wouldn't even make any changes to the dynamic class loader and just add a new one. I think it would be best to only add it if there are any non ASCII characters in the path and there is no system class loader already set. I'll try to have a fix in tomorrow night. |
I figured, that the secret is not being passed on pull requests from forks. As you can see here, https://github.com/jpype-project/jpype/pull/1221/checks?check_run_id=31452501113 Python 3.13 gets downloaded. According to the Azure documentation one can enable the passing of the secret (which is of course not recommended), but there is no option in the gui, at least I didn't spot it (after search for half an hour...). |
@marscher How about we just set up a special run like release to test non-standard builds? Obviously that isn't as good as running the test cases for PR, but would allow us to build more complete testing before the release process. As only you or I will run it we can set up the proper tokens on jpype-project/jpype so we don't need to worry about forks. Just mirror the normal test script and add it to as a pipeline? |
I probably should have asked before I started with the ascii issue but what branch did you want me to work off of? I've already finished I think and am adding more tests. |
I put it here for now just to stash it somewhere before I call it a night. Feel free to just cherry-pick it if that makes your lives easier. There was one test case I couldn't figure out how to implement. It would require setting up a virtual environment placed in a non ascii path. This would cause the jpype jar to be in an unloadable path. |
Merging for release |
That sounds like a sane approach - although I'd have prefered to run tests for upcoming Python versions for all PRs automatically. |
Well, I hate to be the bearer of bad news, but I found the first unintended consequence of the Java agent. I don't think it's too big of a deal though because any classloader being used as a system classloader will most likely have this, even if I forgot to add it.
|
Default methods are a pain because the proxy to interfaces always redirects and there is no "normal" method of accessing such a method via reflection. The PR uses MethodHandles to perform the redirect. It is no clear if this works in Java 8 though as some of the comments regarding this method indicate that it may be a Java 9 feature. I added test cases to see if this is true. If it is only a Java 9 feature we can try to implement a mixed mode jar file, but that will get more complicated.
Fixes #1181