-
Notifications
You must be signed in to change notification settings - Fork 15.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
python: use relative imports in generated modules #1491
Comments
I have the exactly same problem, hope to be fixed |
@goldenbull I submitted a fix, let's see if it makes it through. I'm just not sure: are there cases where we don't want relative imports? |
@little-dude how about if |
Could you provide a small example of what you're thinking about, so that I try it with my change? |
maybe this is not a good case, I don't have enough knowledge about how protobuf/python/etc. deal with the importing |
I don't think this is actually possible because the generated modules follow the hierarchy of the proto files.
with the following:
We generate the code with:
which generated the following:
But this is a more complex case than what I am trying to fix. Edit: I'm not even sure this is a valid case but here is the error I'm getting with the master branch (4c6259b):
|
@haberman is there any chance for this to be fixed before next release? It's quite limiting for python 3. |
I have exactly the same problem (using protobuf v3beta3), the generated imports do not conform to PEP 328 (finalized in 2004), restated in the Python docs: https://docs.python.org/3/tutorial/modules.html#intra-package-references - the 12 yr-old specification is enforced in Python3, so generated protobufs are unusable without further modification. |
any updates on this? |
Yikes, sorry for the slow reply on this. Wouldn't relative imports break the case that you are importing protos from a different pip package? For example, the well-known types come from the google-protobuf pip package. If we merge a change to use relative imports, imports of |
@haberman This bug has to do with protos importing protos in the same package, even in the same directory. The compiler converts this into relative imports with defective syntax under Python 3, so the generated code cannot execute at all. I don't see how you can get away without using relative imports in this case. I've had to manually edit the compiler generated pb2.py files to get them to work at all. |
+1 for fixing this bug. It's stopping me migrating from python2 to python3. |
+1 for fixing as well |
+1, as far as I can tell this completely prevents proto imports in Python 3. Seems extremely worrying that this isn't fixed. EDIT: this is not quite right, see my comment below. |
+1 for fixing |
I believe protobuf is working as intended in this case. The python package generated for a .proto file mirrors exactly the relative path of the .proto file itself. For example, if you have a .proto file "proto/a.proto", the generated python code must be "proto/a_pb2.py" and it must be in the "proto" package. In @little-dude 's example, if you want the generated code in the "generated" package, the .proto files themselves must be put in the "generated" directory:
with protoc invoked as:
This way, the output will have the correct import statements (it will be "import generated.a_pb2" rather than "import a_pb2"). Using relative imports only solves the problem when all generated py code is put in the same directory. That's not the case when you import protos from other projects though (e.g., use protobuf's well-known types). It will likely break more than it fixes. |
I am confused by the claims that this is totally broken in Python 3. We have Python 3 tests (and have for a while) that are passing AFAIK. Why would Python 3 require relative imports? |
The issue that I'm having and that I believe others are having is that the proto code There are already multiple existing issues concerning this problem, and it first seems to have been recognised in 2014 (!!!): #90, #762, #881, #957 |
I read a bit more about Python 3's import rules and I think I can give a better explanation. In Python 3 the syntax If you are compiling protos as part of a Python package (which is the case in most non-trivial Python projects), the interpreter's current working directory is the directory of your main module (suppose the directory is In Python 2, a module inside In Python 3, implicit relative imports don't exist and The root of this problem seems to be that Pre-compilation:
Post-compilation:
This is close to the desired result, but not quite it, because now the absolute reference to the compiled file is In this instance you can actually get it to produce the right output by specifying the python output path I think the 'correct' fix is to make use of the proto package rather than the location of the proto in the directory structure. |
@DanGoldbach Thanks very much for all of the detail. I think a lot of the confusion here has been a result of not fully explaining all of the background and assumptions we are making. The more full description really helps clarify things. Let me first respond to this:
Can you be more specific about exactly what fix you are proposing? An example would help. One thing people seem to want, but that doesn't seem to work in practice, is that a message like this: package foo.bar;
message M {} ...can be imported like this in Python: from foo.bar import M That is a very natural thing to want, but doesn't work out, as I described here: grpc/grpc#2010 (comment) Overall, your directory structure appears to be more complicated than what we generally do at Google (which is the environment where all this behavior was designed/evolved). At Google we generally have a single directory structure for all source files, including protos. So we would anticipate something more like this: Pre-compilation:
Post-compilation:
Because protobuf thinks in terms of this single, flat namespace, that's why we get a little confused when people talk about needing relative imports. I haven't wrapped my head around why this is necessary. Why doesn't the scheme I outlined above work for your use case? |
Thanks, I understand much better now.
I meant that it would be nice if the compiled proto module hierarchy mirrored the package hierarchy specified in the proto source file. As you pointed out in the grpc thread, this isn't feasible right now. Maybe in the future, the one-to-one restriction between proto sources and gens can be relaxed. It sounds like protos work best when the generated files compile to the same directory as the source files, as per your example. Our directory structure has a separate
We explicitly keep generated and source files separate, so your scheme doesn't suit our current repo layout. We would also like the option of using those protos in multiple distinct Python packages in the future, so generating compiled protos into one particular Python package isn't ideal. At Google this isn't an issue because IIRC the entire repo acts like one massive Python package and blaze provides you with the bits of the repo that you need. I think we'll get around this by either adding the compiled proto directory to our Python path or by writing a build command to manually edit the imports in the generated protos to be package-relative imports. Hopefully this helps other people reading the thread. |
Cool, glad we're getting closer to understanding the problems.
I think this would be difficult to do. Right now we guarantee that:
...is equivalent to:
This is important because it's what allows the build to be parallelized. At Google we have thousands of It's also not practical to try and ensure that all So with these constraints we're a bit stuck. It leads to the conclusion that we can't have more than one
Usually for this case we would put the generated code for those protos into a package that multiple other packages can use. Isn't that usually the solution when you want to share code? If you have |
I hadn't considered the constraints placed on the proto compiler by Google's scale and parallelism requirements, but that makes sense. I guess I can compile the protos into their own proto-only package in |
@DanGoldbach Thanks for your example -- it helped me solve a problem. I think your example work as desired if you run
It generates correct import lines and places the |
I am not sure why this is closed and what the official solution to this is. There is a lot of external tools and sed here Official documentation states:
But if that is the case how do we utilize it with imports? |
It surely does! It still feels like a "hack" but there is no better option. |
I'm fairly new to this, but if the point of protobufs is that they can be shared across projects, the protobuf's file structure should not be the determining factor in setting the module structure. Ideally, each tool that generates code from the proto files should be able to define the output, right? |
I'm on Pyhton 3.8.8 and facing the same exact problem. The generated code with If I use relative import like |
@haberman I'm struggling to understand the connection between the parallel compilation issue you described regarding Betterproto (this comment) and the suggestion of using relative paths for imports. You said in this later comment that:
The Betterproto example you gave shows the problem with generating Python module names based on Proto package names. But surely generating relative paths wouldn't break parallel code genertation? Edit: The approach used by More concretely, I am currently working on a project that generates both |
Protoc just needs to generate from . import my_pb2 as ... That would simply solve the issue. |
It would break existing uses, as described in #1491 (comment) If
Whereas with the existing (absolute) import, it works. |
@haberman The aforementioned from ..t2 import my_pb2 |
I tried this import line in my example. It still did not work: /tmp/t2 $ PYTHONPATH=/tmp/t3 /tmp/venv/bin/python3 -m test_pb2
Traceback (most recent call last):
File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/private/tmp/t2/test_pb2.py", line 14, in <module>
from ..t2 import my_pb2 as my__pb2
ImportError: attempted relative import with no known parent package Note, I am running this command from |
@haberman As the error indicates, relative paths require a shared parent module, which requires an Directory structure:
Import line: from ..t2 import my_pb2 Invocation (from the directory above PYTHONPATH=. python -m protoimport.t1.my_pb1 As I mentioned, the |
Right, but this means that if we changed protoc to output the code that you suggested (the code that It's true that this can be worked around by changing the project structure, as you suggested. But the same is true of the initial complaint in this issue: it can be worked around if you change your project structure. The core of this issue is:
If you follow the invariant in (2), everything will work. As far as I can tell, the only reason that people need |
@haberman Would it be a breaking change even if I'm not sure how this would happen, since in the above example, the absolute import path would presumably be ...or are you talking about the case where |
Yes, that case. Many Python packages distribute We need to make sure that imports of proto generated code continue to work across Python packages.
The example I provided is essentially the same as the example given in Protoletariat's README. It also matches the original message in this issue. I just tried this with Protoletariat and verified that indeed it produces this output. Have you been able to try the solution I suggested?
You can respect this invariant either by:
|
or... the original package that everyone is using could just be fixed since all of us complaining about it.... maybe if we complain for another 7 years... they might listen to us.... /s |
I appreciate that; thank you.
Those are both "flat", aren't they? Neither has a
Yes, I have a working Python library generated from the standard
I'm publishing |
@BatmanAoD For pip packages, you might check out my repo with an example of how to set up a pip installable protobuf repo: https://github.com/sbrother/python-protobuf-repo-example. Under the hood it uses a custom setuptools command to compile the protobufs properly on More generally, this issue is never going to get fixed, because Google has millions of protobufs that live alongside the code and are compiled using an internal version of bazel. They literally "import directly from file paths" like you say, and builds are massively parallelized so it's imperative that I strongly recommend using betterproto, which works the way people outside Google expect. Josh is correct that it isn't quite a drop-in replacement -- in particular it solves this issue while breaking the way that protobufs are used inside Google. But we use it at my current employer (10s of millions of DAUs on a Python app) with no issues. |
Just to make sure I understand, you have a package named something like This can be done with protoc. You just need to make sure that your protobuf import paths also start with import "myproject/some.proto"; Proto import paths are just like Python import paths: it is expected that they will be globally unique. So they should probably be qualified by your project name, just like your Python imports are. For example, that's why the Google well-known types are imported as I have personal Python projects outside of Google that use protobuf, and I have never run into the problem described in this bug. I put the |
@haberman Perhaps it is time to lock this issue. |
Just as a side node. In thrift/Facebook the import path is configurable which is an alternative solution, but would require some retooling to respect that on imports and they have a massively large distributed build system as well. I think the approach on this is super idealized and not all of us have the luxury of redoing our entire project structure to fit this idealized . Luckily there are work around, but there is risk that they go unmaintained or break with assumption upstream which I think why there are many of us begging for a fix in protoc itself (that is backwards compatible of course). |
@sbrother thanks for the pointer to your example; I'll take a look at that. We actually did try to switch to Betterproto, found that it was orders of magnitude slower at runtime (I wasn't involved yet when this happens so I can't provide more detail there), and now maintain two pip packages, one generated with Betterproto and the other generated with the standard tool. That's part of why it doesn't really make sense for the protobuf import paths to start with Python package paths: there are two Python packages representing the same protobuf files.
Correct; but the pip package name is based on what the set of protobuf files conceptually are (a specific company's service/domain model specification), whereas the import paths in the I think I at least have a better understanding now of the fundamental disconnects between our setup and the setup that would break if relative paths were generated, so thanks for walking through it with me. I do think our setup is a pretty reasonable case, and based on the thread it doesn't seem to be unusual, so it would be great if the plugin could provide an option or something (like how the Go generator provides |
Maybe just adding a non-default flag in protoc would be enough? |
Just came across this issues. It's very sad seeing how maintainers keep ignoring the community for 7 (!) years when this can be simply resolved by adding a couple flags so the users can configure their python imports. |
I have a package
foo
that looks like this:Generate the code:
protoc -I ./data --python_out=generated data/a.proto data/b.proto
.Here is the failure:
This is beacuse the generated code looks like this:
If the import was relative it would actually work:
The text was updated successfully, but these errors were encountered: