All packages that contain non-package data are now likely installed in a broken way since 7.0.0 #2874
Comments
So FWIW, I have bugs popping up all over the place because of this after the upgrade to pip 7, and I must revert to 6.1.1 for now. This is a rather major blocker for me and would eventually impact many projects in a sneaky, hard-to-diagnose way: packages install OK but then suddenly stop working because of files they cannot find at runtime. |
You can also use the flag in pip 7 that lets you turn off the wheeling of sdists for particular packages.
|
@dstufft thank you for the good tip and the quick reply! That can help, but in my case I automatically pip install hundreds of packages from multiple requirements files and I cannot easily single out a few packages. For now I have kept as sdists all the ones that were broken by the wheel issue, and that was a decent workaround. |
You should be able to put that flag into a config file fwiw, and I think into the requirements file too? |
FWIW I just took a look at the packages:
|
@dstufft I agree with you that these are poorly packaged... I actually forked publicsuffix to publicsuffix2 to fix that... |
Sure :) We knew going into it, some things would have problems :/, the problem is Wheels have been around for almost 2 years now. I don't think giving more time is going to cause the remaining packages to fix themselves, so our choices are to either never make this change, or make it and let those projects work to get themselves fixed. Since this change is a massive improvement (when it works...) to people's lives we decided to go that way but provide the |
Oh they're just poorly packed? What a sensible thing to say ;) Well, actually it turns out that https://bitbucket.org/pypa/wheel/issue/92/bdist_wheel-makes-absolute-data_files Using I would suggest fixing the way that Wheel handles |
data_files has always been iffy in virtualenvs. Does the program using On Fri, Jun 5, 2015, at 06:19 AM, benjaoming wrote:
Links: |
@dstufft could there be a simpler way to detect packages that are likely problematic before even attempting to wheel them at all? After all, the problems are rather narrow in my experience: these are essentially packages that provide data_files and expect the data_files to end up installed exactly where they were relative to the Python code in the sdist rather than relative to sys.prefix, which is more a bug in setuptools than in wheel or pip. Now, wheeling these would rarely if ever fail, so pip would not fall back to a plain setup.py install there. I wonder if simply extracting the sdist, reading the setup.py as a plain file and "grepping" for "data_files" would suffice to make the decision to wheel or not to wheel? |
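The "grep for data_files" idea in the comment above could be sketched roughly as follows. This is only an illustration and not something pip is known to do: the helper name `declares_data_files` is hypothetical, and the check misses keywords passed indirectly (for example through `setup(**kwargs)`), so at best it would be a heuristic.

```python
import ast


def declares_data_files(setup_py_source):
    """Return True if the source passes a data_files keyword to any call.

    Hypothetical helper: it inspects the text of a setup.py without
    executing it.
    """
    try:
        tree = ast.parse(setup_py_source)
    except SyntaxError:
        # The file may target another Python version; fall back to the
        # plain substring "grep" described in the comment.
        return "data_files" in setup_py_source
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            if any(kw.arg == "data_files" for kw in node.keywords):
                return True
    return False
```

A caller would read the `setup.py` out of the extracted sdist and pass its text to the helper. A positive result only says the keyword is present, not that the package actually breaks when wheeled.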
@benjaoming you wrote "Oh they're just poorly packed? What a sensible thing to say ;)" Sorry if that came out poorly... I meant poorly packaged in the sense that there is a conflict between the distutils and setuptools approaches to handling the same data_files directive, and that this is a poorly understood issue (I came to grasp it only recently) that leads to poorly packaged code, not because of the packager:
wheel has been taking the distutils interpretation even though wheels are practically built with a setuptools extension. pip was neutral until v7 because it would install wheels as wheels and sdists as sdists by running setup.py install. pip 7 now runs setup.py bdist_wheel first, which means that the wheel interpretation will always be applied.

So the crux of the issue is the inconsistency between distutils and setuptools, the distutils side taken by wheel, and the relative prominence of setuptools vs distutils. Personally I was originally split on the issue... There are likely very few packages relying on pure distutils behaviour. In fact it sounds like most everyone, even when using pure distutils, expects their package to work with a setuptools interpretation of where data_files will land, as in the case of publicsuffix or selenium. I think this is likely because when you install with pip the setuptools behaviour will apply even when the distutils code is used in a setup.py.

So after a good thought, I am no longer split I guess... the setuptools behaviour should be preferred. |
Honestly, projects that are using data files and are doing hacks to make it act like package_data, should just use package data. What's the point of having data files and package data if they both act the same? |
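As a minimal sketch of the "just use package data" suggestion, assuming a hypothetical package named `mypkg` that ships text files under `mypkg/data/` (neither name comes from any project in this thread): the keyword arguments below would be passed to `setuptools.setup()`, and the helper finds the file relative to the installed module rather than relative to sys.prefix.

```python
import os

# Hypothetical arguments for setuptools.setup(**SETUP_KWARGS) in a setup.py.
# package_data keeps the files inside the package directory, so they sit
# next to the code whether the install came from an sdist or a wheel.
SETUP_KWARGS = dict(
    name="mypkg",
    version="0.1",
    packages=["mypkg"],
    package_data={"mypkg": ["data/*.txt"]},
)


def data_path(module_file, *parts):
    """Build a path relative to the module that ships the data."""
    base = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(base, *parts)
```

Inside the package, the lookup would be called as `data_path(__file__, "data", "suffixes.txt")`, where the file name is again only an example.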
@pombredanne I don't think there's any reason to expand the behavior of setuptools. Ultimately, the ones who implement a setup.py will have to deal with the problems of wanting files in paths that are not relative to sys.prefix, and taking care to use sys.prefix in one's @dstufft please consider that not all data files are package-related, they can be |
I think the ability to write to arbitrary paths outside of |
@dstufft |
|
@dstufft !? maybe you wanna take that to a different issue. |
The key difference in viewpoints is that you're coming from the position that the current solution allows X, so X must be preserved. That's not how most (if any?) of the people pushing Python packaging forward are viewing things. The current solution allows anything, so if we try to preserve that then every change is backwards incompatible and we might as well give up and just let things sit exactly how they are; some folks call this Postel's Tarpit. Unfortunately, this means that at some point we have to break things for some subset of users. We've been doing this with each release for different subsets of users:
That's just the stuff that I'm able to think of off the top of my head. Throughout all of this we are attempting to balance what we're breaking with the benefits of breaking it, and looking for use cases that we want to still support that aren't able to be handled by what we've left it with. For the feature at hand I've seen two situations where people are using it:
The first of those should be broken; it's silly to poorly reimplement a different feature instead of just using that feature. In my experience this is also the most common use case for the data_files feature, and I believe that if we hold up improvements due to it we are doing the entire community a disservice for the benefit of a few projects. These projects have had almost 2 years to try using Wheel for their projects and to fix any issues with installing their project via Wheel. At this point it's unlikely that anything but breaking them is going to push them to fix their packaging.

The second situation is one where I think there is a valid use case, but I don't think the valid use case involves being able to write to arbitrary paths anywhere on the system. If someone installs your project into a virtual environment, you shouldn't be able to touch files outside of that virtual environment. The fact it is technically possible today because of the nature of

Ultimately projects that rely on something that doesn't work while wheel'd should modify their

As we attempt to "climb out" of Postel's Tarpit, it is inevitable that things break because, as it stands, the attempts to do something (anything, no matter how broken or silly it was) given any input have caused a situation where any change breaks things. |
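The constraint argued for above, that installed files stay inside the environment, can be illustrated with a small check. This is a sketch under assumptions, not pip or wheel behaviour: the helper name `is_within_prefix` is hypothetical, and relative targets are resolved against the prefix, matching the "relative to sys.prefix" reading mentioned earlier in the thread.

```python
import os
import sys


def is_within_prefix(target, prefix=sys.prefix):
    """Return True if target resolves to a location under prefix."""
    prefix = os.path.realpath(prefix)
    # An absolute target replaces the prefix in os.path.join, so a path
    # such as /etc/... is tested as-is; relative ones land under prefix.
    resolved = os.path.realpath(os.path.join(prefix, target))
    try:
        return os.path.commonpath([prefix, resolved]) == prefix
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

With a prefix of `/opt/venv`, a target such as `share/myapp/app.cfg` passes, while `/etc/myapp/app.cfg` and `../outside.cfg` do not.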
+1 to all that @dstufft said. To make progress, we have to establish policies which might not match current behaviour. The aim is to meet existing requirements, so that any valid use cases are still supported. But:
So, to cut a long story short - |
Guys, we're not deprecating Do you have suggestions on how Pip / Wheel can help track files installed in '/etc' ? Or do you encourage that users write their own setup.py-like logic for their applications' first-run? Because I can easily see lots of ugly behaviors being implemented in the application scope if Wheel doesn't want to deal with it.
No, that's not the difference. I understand the other decisions, but it's a logical fallacy to use them as justification. I wrote about current use cases of
This will most likely fail anyways. If someone is installing in a virtualenv and data files are put outside it, usually the virtualenv's pip instance won't have write access. In case it does, well, this IMO is up to the application's setup.py: Does it want to respect other existing files? Other installed versions? This is the sort of freedom that is needed for certain things. If |
We obviously haven't removed all old behaviors nor would we, to say anything else is a strawman. The ability to stick files in arbitrary locations is a misfeature, and supporting features that make that possible isn't something I'm interested in doing. Full stop. All files should be scoped to be within the scope of the environment they are being installed into. You've correctly pointed out that people can just do it by throwing some arbitrary Python into their
I mentioned this issue to an IRC channel I happen to be in that has a number of Python developers in it, and one of them commented (name removed):
I suspect that a large majority of developers do not expect that installing a project into a virtual environment (or a I also do not believe that it should be up to random authors whether they respect other existing files or other installed versions. You can argue that in some scenario this freedom is needed, and that is true I can probably come up with a scenario where something like that would make things better for that one particular project, but part of designing a sane software ecosystem is being able to look at a particular feature and being able to say "no". We recognize that there are use cases where this may be useful, and we invite those people to help us come up with a solution that satisfies both their constraints and our constraints. Until then, they have a number of solutions available to them:
They had the last two years to play with Wheels and be proactive in ensuring that they don't break with them. If they've failed to do that over the last 2 years, we're now at the reactive stage where they have to react to the fact that they need to either fix themselves or put up workarounds. Eventually (another 2 years? Who knows for sure with volunteer timelines) if they only institute workarounds they'll have to react again when their projects stop being installable altogether. |
@dstufft you wrote:
I could not agree more. I would go even further and say that the ability to write to arbitrary paths outside of where the root package is installed is a misfeature which allows Python packages to be misused as system packages... |
@benjaoming you wrote:
My 2 cents: if this is the use case you are after, then IMHO Python packages, be they wheels, eggs, sdists or bdists, are not what you should be looking for. If you are on Linux, use the package manager of your distro for this instead. |
Well, essentially the packaging system (Pip+Wheels) is creating value by managing what's installed on a system. If a piece of Python software needs data files outside of the environment (where package data lives), then we all agree that there are valid use cases. So the question is: Does Wheel want to create this value by accommodating scenarios where distributed files are placed outside of the OS-universal Python environment? If not, Pip+Wheels can remain a great tool for certain things, but you will definitely see people reinventing lots of wheels to reimplement an often-needed feature for system services and desktop applications. Your mindset seems to be on Python libraries and command line scripts. So in case you leave this as is, future Python package distributions may start leaving around unmanaged data files and use the same hacks that you were trying to avoid in order to force non-OS-specific behavior. But you can't really force it, when essentially it doesn't exist :) Q: I want to create a package on PyPI that can be installed with Does the above sound correct? |
If you want to install desktop shortcuts or start up services then start up a thread on distutils-sig about that so we can collectively figure out if we want to support that, and if we want to support that, how we can best do it in a way that will work across all the various platforms as well as virtual environments, user installs, and such. These are things that can be explicitly designed, with ultimate control passed on to the end user as to whether they want that thing or not. |
@benjaoming To me the Python packages "system" is about discovering, provisioning and installing Python libraries used in a larger context of Python-based applications. This is the approach of most if not all language-specific package management systems for Ruby, Perl, JS, Java, etc. My opinion is that the provisioning and installing of complete applications, including things that deal with system services, desktop integration, databases and more, is best left for other tools to handle. In most cases that means the OS package tool (debs, rpms, brew, chocolatey, native installers, etc.) or tools dealing with configuration "in the large" such as buildout and the Ansible, Salt, Puppet and Chef of the world. And containers. I would never attempt to use only pip and wheels for this. |
For my case, it's fine to have But consider what happens if you do not implement a backwards-compatible way of handling My proposal is still that it should work as before, but with your objections intact: It's discouraged to place files outside the userspace Python environment. But we can still embrace the freedom and value of having the ability in Wheels to install and track these files. Btw |
@benjaoming I wonder if the number of actual packages using data_files is not rather small |
@pombredanne no way of telling quite yet I suspect :) Stirring the waters to see what happens is inevitably what's happening now... But this Google does look like there's a substantial bit of setup.pys out there with |
Personally, I think the distutils behavior makes more sense. If we want something relative to the package dir we already have |
I agree, see also setuptools #460. |
I just released my first Python package, and it is affected by this issue. I would like to avoid absolute paths, as suggested, but I do not know the proper way. Can someone give me a hand? Here is a link to my Stack Overflow question that goes into more detail. |
The file contains outdated information and nobody is maintaining it any longer. Plus it only works on Linux, and only if pip installing as sudo (which is bad!). We need proper Sphinx+ReadTheDocs kind of documentation.
Since 7.0.0, sdists are converted to wheels before being installed. As a result, a package is never setup.py install'ed but always bdist_wheel'ed first.
This means that every package that contains additional files hits the wheel bug https://bitbucket.org/pypa/wheel/issue/120/ and/or https://bitbucket.org/pypa/wheel/issue/92/bdist_wheel-makes-absolute-data_files which are the same issue.
This includes breaking the pip install of a large number of packages that could only be installed from sdists because of this wheel bug, such as the reasonably popular selenium, pdfminer, publicsuffix and many more.
And since an sdist is always wheeled before install, no workaround is possible anymore: forcing the use of an sdist will always go through the wheel mechanics anyway.
The short-term solution is to stay on pip 6.x.
The solution must be to fix pip ASAP to not always use wheel and/or to fix wheel ASAP for these bugs: https://bitbucket.org/pypa/wheel/issue/120/ https://bitbucket.org/pypa/wheel/issue/92/
The bug was introduced in pip by #2618 and has not been marked as a breaking change in the release notes, but this is a major breaking change that likely breaks the installation of hundreds of packages starting with pip 7.0.