Use of pbr breaks cx_freeze applications #385
I second that; I have the same problem: frozen applications (using PyInstaller or cx_Freeze, for example) break with this error:
Some alternatives to using Both can generate a Some questions:
So, the APIs to query installed app metadata are in principle compatible with freezing - and if they are fixed, then lots and lots of libraries will start working. Why are you embedding mock - a testing library - into frozen apps anyway? I'm not super keen on changing btw, but will consider it if someone does the work. Requirements:
(Oh - and because I don't want someone doing all the work and me saying no, please at least get consensus here on the proposed direction before sinking a lot of time in)
A note on the "why": tests should always be run against frozen applications, to make sure that the tests actually exercise what would be distributed to the final user, not a developer's dev environment (the test task makes the freeze, copies the tests to a proper place to be added to the PYTHONPATH, and runs them against the frozen library without any other external references).
I'm not sure; here's the implementation of `get_version`:

```python
def get_version(package_name, pre_version=None):
    """Get the version of the project.

    First, try getting it from PKG-INFO or METADATA, if it exists. If it does,
    that means we're in a distribution tarball or that install has happened.
    Otherwise, if there is no PKG-INFO or METADATA file, pull the version
    from git.

    We do not support setup.py version sanity in git archive tarballs, nor do
    we support packagers directly sucking our git repo into theirs. We expect
    that a source tarball be made from our git repo - or that if someone wants
    to make a source tarball from a fork of our repo with additional tags in it
    that they understand and desire the results of doing that.

    :param pre_version: The version field from setup.cfg - if set then this
        version will be the next release.
    """
    version = os.environ.get(
        "PBR_VERSION",
        os.environ.get("OSLO_PACKAGE_VERSION", None))
    if version:
        return version
    version = _get_version_from_pkg_metadata(package_name)
    if version:
        return version
    version = _get_version_from_git(pre_version)
    # Handle http://bugs.python.org/issue11638
    # version will either be an empty unicode string or a valid
    # unicode version string, but either way it's unicode and needs to
    # be encoded.
    if sys.version_info[0] == 2:
        version = version.encode('utf-8')
    if version:
        return version
    raise Exception("Versioning for this project requires either an sdist"
                    " tarball, or access to an upstream git repository."
                    " Are you sure that git is installed?")
```

It tries these methods to find the version, in order:

1. The `PBR_VERSION` or `OSLO_PACKAGE_VERSION` environment variables.
2. The package metadata (`PKG-INFO` or `METADATA`), via `_get_version_from_pkg_metadata`.
3. The git repository, via `_get_version_from_git`.
None of those methods work with a frozen executable. Perhaps I'm missing something regarding how meta-data can be found in frozen applications?
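(As an aside: given the lookup order above, one possible escape hatch for frozen executables is to pin the version through the `PBR_VERSION` environment variable, which `get_version` consults first. This is only a sketch, and it assumes the frozen app's version lookup actually reaches this function at runtime; the version string is illustrative.)

```python
import os

# get_version() checks PBR_VERSION before package metadata or git (see the
# code above), so a frozen entry point can pin the version explicitly before
# any pbr-using package is imported. The version string here is illustrative;
# a real build script would inject the actual released version.
os.environ.setdefault("PBR_VERSION", "2.0.0")

import mock  # imported only after the override is in place
```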
We execute our tests using the frozen application, to ensure we are packaging everything correctly. We do this by passing a command-line option to our application. This is the main problem I'm attempting to resolve:
Oh, this is generated automatically by
The main issue seems to be that cx_freeze isn't properly incorporating the package metadata. I suggest that issue be resolved directly, or that mock provide a fallback behavior for when the metadata isn't available. I'd rather not have every package resort to bypassing the proper metadata channels in order to function in environments that fail to supply proper metadata. Mock could do this:
Or maybe this:
This approach would paper over the issue until version metadata is properly supported by cx_freeze, but would still rely on the proper metadata in environments that do support it.
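(The concrete snippets were lost in the formatting above, but the kind of fallback being suggested would presumably look something like this sketch; the fallback string is illustrative.)

```python
import pkg_resources

try:
    # Normal case: resolve the version from the installed distribution
    # metadata.
    __version__ = pkg_resources.get_distribution('mock').version
except Exception:
    # Frozen case: metadata is unavailable (e.g. inside a cx_Freeze or
    # PyInstaller bundle), so degrade gracefully instead of failing at
    # import time.
    __version__ = 'unknown'
```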
I understand, but I have the feeling that even that won't be sufficient. I haven't investigated whether other packages that deal with metadata support reading from zipfiles produced by freeze tools such as I personally like
So, if absolutely no package consults the metadata, then yes, you can avoid the cost, but it's a tragedy of the commons - there is no incentive to work around it for any one package unless they are the 'one package' causing issues. That said, I'm not sure why pbr is manually parsing the metadata file instead of using get_distribution; I'm willing to bet someone did an optimisation some time back. I don't think it was me :P. The point is, though, that that fix is cheap - use the right API - and then fix cx_freeze, and whole classes of things will work properly that don't right now. The example of 7.6 seconds to import is bad - it's possible, if mock was involved, that it's an install missing the metadata files - if so, it's self-inflicted: the metadata is PEP-described, it's part of Python, and avoiding it just makes everyone's life harder.
Well in this case unfortunately it is the use of
I agree that would be the right approach. I tested whether including the metadata information in the zip file would fix this, but it seems things are still broken, for
The Now testing
That's expected - it is the same error produced by

I get a similar error using the
It seems the usual packaging tools don't work when you have a single zipfile with several distributions inside it, so it is not just a matter of fixing

Btw, thanks to everyone who has contributed to the discussion so far - I really appreciate it.
No, you're right.
It's a known defect of pkg_resources (though at the moment I can't find a specific ticket) that it performs a large amount of logic during import time. Indeed it shouldn't be an expensive operation to retrieve the metadata for a single, installed, specified distribution, but at the moment it is. The solution shouldn't be to additionally store the required metadata using a private convention and then load it from there. The solution should be to continue to refine the specs and implement the PEPs and get the tooling to perform well such that there's one obvious (and performant) way to retrieve metadata.
Oh OK, I misinterpreted it then - thanks for the clarification and for confirming the current state of things.
I didn't realize that we have a recommended convention for obtaining the installed version from within a package. But I certainly agree that we as a community would benefit more from a clear, efficient, and well-defined standard than from having each package adopt its own solution. You mention "the specs and implement the PEPs" - do you have any pointers to them handy? Would PEP 376 be an example of such a PEP?
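(For readers following along, the convention under discussion is `pkg_resources.get_distribution`. A minimal sketch of querying it - the distribution name 'mypackage' is a placeholder:)

```python
import pkg_resources

dist = pkg_resources.get_distribution('mypackage')  # placeholder name
print(dist.version)       # the version string from the installed metadata
print(dist.project_name)  # the canonical distribution name
print(dist.location)      # where the distribution was found on disk
```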
> It's a known defect of pkg_resources (though at the moment I can't find a specific ticket) that it performs a large amount of logic during import time. Indeed it shouldn't be an expensive operation to retrieve the metadata for a single, installed, specified distribution, but at the moment it is. The solution shouldn't be to *additionally* store the required metadata using a private convention and then load it from there. The solution should be to continue to refine the specs and implement the PEPs and get the tooling to perform well such that there's one obvious (and performant) way to retrieve metadata.
This seems like quite a lot of work for something that a *much* simpler approach would have made a non-issue from the start... I do hope that if such a PEP actually arises, it's sensible enough to just replace a string or generate a module so that __version__ = '2.0.0' is set at release time, rather than having to calculate it at import time as it is now (it makes something which should be really simple complicated)...
Not that it can't have APIs to query dependencies or get versions of other modules, but regarding the __version__ of the *current* module: as it's usually set at import time, it seems backward that it should depend on importing anything - really, when I'm looking at source code I expect to see its version right there, not have it computed and changed depending on other external dependencies. Just as a reference, see https://bugs.python.org/issue28637, where a report on *enum* was actually treated as a bug because of its performance implications (i.e.: this really matters).
Also, it seems we agree that the current approach doesn't **currently** work (not that it won't in the future, but at present it seems it's not really there for general consumption).
I also didn't realize that
Yes, PEP 376 is a good start, though I see it was relying on Distutils2, which never came to fruition. So pkg_resources.get_distribution is probably the best approximation of that functionality. To the extent that the specs and PEPs don't provide a clear specification, I believe the intention is still there - that the recommended place for metadata (including the version number) to be defined is in the package metadata.
I'm not suggesting that the version be calculated at import time, just resolved (linked) to the same information in the package metadata. There are some real, practical challenges with simply copying that value to a file at build time (release time).

The first issue is that the value gets stored twice (rather than stored once and referenced), leading to potential for inconsistency.

The second issue is that it's not obvious where such a file could be reliably generated across all packages, especially when you consider namespace packages. Consider the packages pmxbot and pmxbot.rss - where would a build tool inject the versions for those packages? Should all modules in a package get a version, or only top-level modules/packages? And what defines a top-level package?

A third issue is that of source control. You say, "when I'm looking at source code I expect to see the version for it there," but this expectation can't be met in general. If you have a version there when the code is unreleased, that version will be incorrect for all commits except the released ones. Additionally, if that file is modified as part of a release process (to inject the version being released), that causes files under source control to be modified, requiring additional consideration for development environments.

A fourth issue is that many projects use SCM tags to designate releases in the source code, which is an additional place that the version numbers need to be indicated. Tools like setuptools_scm attempt to unify those versions as well, allowing the version number during a release to be designated in exactly one place and thereafter derived for the package metadata and imported packages.

A fifth issue is that packaging tools like setuptools allow adding tags to a version at build time. For example, setuptools adds a

So while it may seem like an over-complication to define the version in metadata and resolve it through an API, there are some real and practical reasons for doing so, addressing limitations of a simple version file for simple packages. Most importantly, the principle of not repeating oneself and linking values rather than copying them, coupled with the fact that the version number must be advertised in package metadata, means that the package metadata is where the version number is best indicated.

It seems to me there are two issues at play here: pkg_resources is slow to import (pypa/setuptools#510), and pkg_resources isn't readily available in the stdlib, with the additional limitation that it's bundled with setuptools and can't be installed separately. Does that sound about right?
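(For reference, the tag-unifying approach mentioned above can be sketched as follows, following setuptools_scm's documented usage; the project name is a placeholder.)

```python
# setup.py -- derive the version from the latest SCM tag at build time,
# so the tag is the single place the release number is designated.
from setuptools import setup

setup(
    name='mypackage',                    # placeholder project name
    use_scm_version=True,                # version is derived from git tags
    setup_requires=['setuptools_scm'],
)
```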
Hi jaraco, thank you for your follow-up...
I think this depends largely on the approach chosen... you can still devise a way to have it in only a single place (either from the SCM or in a version file somewhere). For a reference, requests just reads the version from the source (https://github.com/kennethreitz/requests/blob/master/setup.py), which seems a sensible way to have it defined in only one place. Another approach could be getting it from the SCM and storing it at release time, where the single source would be the SCM (and the version would remain undefined until release time).
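(That "read it from the source" pattern can be sketched roughly like this; the file path and regex are illustrative, not the exact code requests uses.)

```python
# setup.py -- single-source the version by parsing it out of the package
# source, so it is defined in exactly one place.
import re

with open('mypackage/__init__.py') as f:
    version = re.search(r"__version__ = ['\"]([^'\"]+)['\"]", f.read()).group(1)

# setup(name='mypackage', version=version, ...)
```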
Well, for namespace packages, I think that each package inside the root namespace is a completely different beast with its own version, so I don't see any issues there. (Personally, I think that namespace packages go a bit against the very nature of Python, but if you must use them, then every package inside the namespace should be completely independent anyway and have its own version - and you have those same issues with any approach chosen, probably even worse, as fetching the metadata for those cases makes things even less straightforward.)
I'd like to politely disagree, since in most projects I can just go to GitHub and see it. I.e.:

Also, if you have the version derived from the SCM, then you'll only know the version at release time (which is a different approach - and in this case there are also solutions, such as having a separate module generated for the version - and having it in .gitignore and
I think that if this is the approach chosen, then at build time it seems it'd be straightforward to update the module to have the proper version - and then, by definition, a version would only be valid at release; trying to derive a version from git doesn't seem consistent - i.e.: if the user gets a tarball from a branch, what's the version? Why is it different if checked out using git? I'd say that in this case the version should always be undefined, for consistency.
This seems fair to me (i.e.: not an issue: you can do whatever you wish at release time, as long as it's properly installed later on).
Not really. The major one reported (i.e.: that it doesn't work properly with cx_freeze or any other "freeze" tool) is still an issue. Also, I think those are nice tools, but having them set the
In all of these cases, it's possible to mistakenly mis-label a release. Consider requests:
In this case, you've installed code that's indistinguishable in behavior from 2.12.3, but is indicated as version 2.12.2.
The issue I see is that if you browse to a particular commit, the version indicated is the same version as many adjacent commits, only one of which might be the official release of that version (or that version may never have been released). The version in that file is more often incorrect and misleading than correct. It does, admittedly, give an approximation of the proper version.

Thinking about tags, consider if one accidentally or maliciously tagged a version that differs from the version number in the file system. I've seen this happen on dozens of occasions... because it requires two separate manual steps to be correctly executed in order, leading to tools like bumpversion, but even those are subject to operator error. I personally seek a solution that minimizes repetition and the potential for operator error. Still, I respect that other projects might prefer to do more of these steps manually and manage the consistency through convention. I'm okay with that, though I still struggle to think of a recommendation one could make that works in the general case.

As I think more about the issue with namespace packages, I realize the issue is less about the use of namespace packages and more about the disparity between a distribution's version and a Python package's version. I admire allowing for (and recommending) the developer to derive a version number (such as

So I guess to summarize: while I welcome projects to expose

If the maintainers of mock want to stop using this API and maintain the version another way, that's fine, but that approach is only an incomplete workaround, leaving the underlying cause (the inability to get distribution versions in cx_freeze applications) unaddressed.
@jaraco what's your opinion on the approach taken by setuptools_scm, which has the option to derive the version at build time and write a small Python file with that information, which is then exposed by the package via a simple

From my point of view, this has the benefit of automatically deriving the version number from tags while adding no extra run-time overhead to provide the version once installed.
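(For concreteness, a sketch of that option - paths and names are illustrative; setuptools_scm's `write_to` generates a small `_version.py` at build time:)

```python
# setup.py
from setuptools import setup

setup(
    name='mypackage',
    use_scm_version={'write_to': 'mypackage/_version.py'},
    setup_requires=['setuptools_scm'],
)
```

The package can then expose it with no run-time dependency:

```python
# mypackage/__init__.py -- assuming the generated file defines `version`
from ._version import version as __version__
```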
Indeed it is mostly a workaround, but is it common to query other information from the distribution (such as installed files)? I think it boils down to the fact that, alas, the version of a package/module is traditionally defined in a
I'm generally in favor (+1). I don't use the file-writing feature of setuptools_scm, but if that works for you, I think that approach alleviates a number of my concerns.
It must be resolved to something, which could perhaps fall back to a constant value like
I see, thanks. Using
Sorry, I meant to use the current methods in a function, like this:

```python
def get_version():
    import pkg_resources
    try:
        return pkg_resources.get_distribution('setuptools').version
    except Exception:
        return 'unknown'
```

And then users can get the version by calling get_version().
I thought I would file a ticket with cx_freeze to capture this need and recommend (at least at a high level) a path to a solution.
FWIW there's #362, which reverts the use of
#362 was actually about providing a stub ChangeLog as a convenience to some folk, which I'm fine with; but the PR had an additional commit added that introduced manual versioning, which I'm not fine with.
@rbtcollins oh OK, thanks for the clarification.
The cx_Freeze ticket is now at marcelotduarte/cx_Freeze#216
I'm closing this now, since we all seem to be of the understanding that cx_freeze is incompatible with some base packaging PEPs, and the onus is on that project to fix it.
The given code:

is not really friendly to frozen applications... Also, I'd say it adds a lot of logic under the hood just to get the version (besides adding a runtime requirement on setuptools, which is usually only a setup-time requirement), so I'd like to check how feasible it'd be to revert it to just coding the __version__ and version_info directly into the source code - I'd also say that makes it easier to know the current version by looking at the source code, and makes the code clearer ;)
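(For comparison, a sketch of the hardcoded alternative being proposed; the version number is illustrative, and the tuple conversion assumes purely numeric release components.)

```python
# mock/__init__.py -- version defined directly in the source (sketch)
__version__ = '2.0.0'
version_info = tuple(int(part) for part in __version__.split('.'))
```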