
Encourage users to upload wheels for pure Python packages #496

Open
rth opened this issue Apr 20, 2021 · 19 comments

Comments

@rth

rth commented Apr 20, 2021

Issue originally opened at pypa/twine

I was wondering if there is any mechanism that could encourage users to upload wheels for pure Python packages? Or should this be a feature request for https://github.com/pypa/warehouse?

Problem statement

Currently some package maintainers only upload a .tar.gz to PyPI for pure Python packages, without the associated wheel. For instance, according to the analysis in https://gist.github.com/josephrocca/ca3f09c4db1df6288bbc64428e6bed77, around 1/4 of the 4000 most frequently downloaded PyPI packages only include the source distribution. Some maintainers might forget, others may not be aware of the motivation to do so.

This results in:

  • slightly slower installs since the wheel is then built locally (which is fast for pure Python packages, but still adds some overhead)
  • harder to analyze package artifacts (with a dynamic behavior in setup.py)
  • and, in the particular case of https://github.com/pyodide/pyodide, the inability to install the package from PyPI (Use wheel package to create .whl if not available? pyodide/pyodide#1501). We keep asking maintainers to add wheels, but this approach doesn't scale, and we were wondering if some centralized solution might exist that would benefit everyone and slowly move the ecosystem in that direction.

For instance, poetry publish already uploads both the .tar.gz and the wheel for pure Python packages.

Possible implementation

Given that artifacts can be uploaded in multiple calls, I'm not sure if this would even be technically possible, assuming there is interest in this feature.

One possibility could be to display a note/warning if twine is used to upload the sdist and the corresponding wheel is not present for a pure Python package, though that would likely produce false positives.
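For illustration, a minimal sketch of what such a check could look like, assuming it only inspects the files passed to the upload command; the helper name and the matching rule are illustrative, not part of twine:

from pathlib import Path

def sdists_without_wheels(paths):
    """Yield each sdist in this upload batch that has no wheel alongside it."""
    files = [Path(p) for p in paths]
    wheels = [f.name for f in files if f.suffix == ".whl"]
    for f in files:
        if f.name.endswith(".tar.gz"):
            stem = f.name[: -len(".tar.gz")]  # e.g. "pkg-1.0"
            # Ignores name normalization (hyphens vs underscores), which is
            # one source of the false positives mentioned above.
            if not any(w.startswith(stem + "-") for w in wheels):
                yield f.name

for name in sdists_without_wheels(["dist/pkg-1.0.tar.gz"]):
    print(f"warning: {name} is being uploaded without a matching wheel")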

Another possibility could be to emphasize this a bit more in the documentation.

@josephrocca

josephrocca commented Apr 20, 2021

around 1/4 of 100 most frequently downloaded PyPi packages only include the source distribution

Minor correction on this: ~25% of the top 4000 most frequently downloaded packages (according to this list) don't have a published .whl file. Only about 3% of the top 100 are missing the .whl. Doesn't change much about this issue, but I figured I'd clarify in case someone tests the script and is confused.
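For reference, one quick way to check whether a single project's latest release includes a wheel is the public PyPI JSON API; this is just a sketch and not necessarily what the linked gist does:

import json
from urllib.request import urlopen

def has_wheel(project):
    """Return True if the latest release of `project` includes a wheel on PyPI."""
    with urlopen(f"https://pypi.org/pypi/{project}/json") as resp:
        data = json.load(resp)
    # "urls" lists the files of the latest release; packagetype is
    # "bdist_wheel" for wheels and "sdist" for source distributions.
    return any(f["packagetype"] == "bdist_wheel" for f in data["urls"])

print(has_wheel("requests"))  # requests ships a universal wheel, so True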

@sigmavirus24
Member

Currently some package maintainers only upload a .tar.gz to PyPi for pure Python packages, without the associated wheel.

Is that a problem if those package maintainers don't feel that wheels are important or valuable enough to generate? Put another way, I recall people feeling that binary distributions restricted "freedom" and were vocally against introducing the wheel format.

Is this less than ideal? Certainly.

Is it possible these are ideologues refusing to distribute something they fundamentally disagree with? Yes.

harder to analyze package artifacts (with a dynamic behavior in setup.py)

Some people are also set in their ways and want to have that dynamic behaviour which also rules out wheels for them. These two things are not entirely separate.

We keep asking maintainers to add wheels but approach doesn't scale, and we were wondering if some centralized solution might exist that would benefit everyone, and slowly move the ecosystem in that direction.

Unfortunately, the best motivator to do anything will always be "My users find this valuable". The next best will be "this tool I use suggests I do so and links to compelling reasons to do so and/or automates doing so for me".

As for "centralized solution", as far as I know, twine isn't the only path to uploading artifacts to PyPI. And as you said,

Given that artifacts can be uploaded with multiple calls, I'm not sure if it's something that would be technically possible even in case of interest in this feature.

Twine could then query the index to determine if there's a wheel that exists, but we don't do that today and I'm not convinced we should start. If something were to give this warning, it should be PyPI. And I won't speculate on a possible implementation.

In general, this issue probably belongs on pypa/packaging-problems (if a similar issue doesn't already exist). Let me know if you'd like me to transfer this issue over there.

@rth
Author

rth commented Apr 21, 2021

Thanks for your response @sigmavirus24 !

In general, this issue probably belongs on pypa/packaging-problems (if a similar issue doesn't already exist). Let me know if you'd like me to transfer this issue over there.

Yes, if you could move it, that would be great. It's related to #25 (but that one focuses more on wheels for packages with C extensions).

sigmavirus24 transferred this issue from pypa/twine Apr 21, 2021
@di
Member

di commented Apr 21, 2021

I think this is probably a duplicate of #25, which I don't think is exclusive to wheels with C extensions.

I think the "Create a generic wheel-building service to make releases faster and more robust" part of https://github.com/psf/fundable-packaging-improvements/blob/master/FUNDABLES.md is also relevant here.

@henryiii
Contributor

Build creates both a wheel and an SDist by default, so that helps here, I think. And tools like cibuildwheel help make binary wheels (ideally) on CI if it's not pure Python. As long as the docs point to making and uploading wheels too, which I think ours do, I don't know if there's that much that can change.

@rth
Author

rth commented Apr 29, 2021

I think the "Create a generic wheel-building service to make releases faster and more robust" part of https://github.com/psf/fundable-packaging-improvements/blob/master/FUNDABLES.md is also relevant here.

If PyPI could build the wheel automatically for pure Python packages, that would clearly be the ideal case. But I imagine that would take lots of effort and funding...

#25, which I don't think is exclusive to wheels with C extensions.

It's not, but the subset of that issue discussed here should also be less difficult to address from a technical/UX point of view (since it doesn't need to get into the discussion of building C extensions).

Build creates both wheel and SDist by default, so that helps here I think.

What command do you mean by "build"? Yeah, if the default workflow was

twine build   # generates sdist + wheel for pure Python packages
twine upload  # without specifying the path, upload both

(or using some other tool) then most of this issue would be addressed IMO.

@henryiii
Contributor

pypa/build, that is,

pip install build
python -m build

or

pipx run build

(Which works on GHA out of the box with no Python setup!)

See https://packaging.python.org/tutorials/packaging-projects/#generating-distribution-archives or https://scikit-hep.org/developer/gha_pure for examples.

@pfmoore
Member

pfmoore commented Apr 29, 2021

What command do you mean by "build"?

https://pypi.org/project/build/

That's the build tool recommended in the tutorial.

@rth
Author

rth commented Apr 29, 2021

That's great, thanks! I missed that development.

Just need support for,

twine upload   # without a path

then...

@rth
Author

rth commented Apr 29, 2021

To elaborate, I'm not sure this applies to other maintainers, but I would probably not run

twine upload dist/*

as recommended in the tutorial, but rather would upload them one by one to make sure no other unrelated packages get uploaded (for instance python -m wheel generates wheels for dependencies for packages with C extensions). Or at least carefully check that folder first. So some logic that does the correct version filtering by default, without relying on bash globbing, might help. As poetry publish does, I imagine.
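For illustration, the kind of filtering meant here might look like the sketch below; the function name and the exact filename rules are assumptions, and real project name normalization has more corner cases:

from pathlib import Path

def select_artifacts(dist_dir, name, version):
    """Yield only the sdist and wheels for one project/version from dist/."""
    sdist = f"{name}-{version}.tar.gz"
    # Wheel filenames normalize hyphens in the project name to underscores.
    wheel_prefix = f"{name.replace('-', '_')}-{version}-"
    for path in sorted(Path(dist_dir).iterdir()):
        if path.name == sdist or (
            path.name.startswith(wheel_prefix) and path.suffix == ".whl"
        ):
            yield path

# e.g. upload only my-project 1.0.0, ignoring anything else sitting in dist/
for artifact in select_artifacts("dist", "my-project", "1.0.0"):
    print(artifact)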

@henryiii
Contributor

henryiii commented Apr 29, 2021

If PyPi could build the wheel automatically for pure Python packages

This would be a mess; there are pure Python packages that depend on the Python version to generate a wheel, for example, so if it was done "automatically", then the wheel would only work on one version of Python. Most packages do provide wheels; the ones that don't usually have a reason or are working on it. black didn't have a wheel for the previous version, but that's because they deleted the wheel since there was a possible problem with it (package and module both present). Missing a wheel when it's easy to add one does happen, but it's not that bad - most tutorials make wheels too, and that's where it counts.

twine upload without a path

Most tools (pip (?), build, gh-action-pypi-publish) assume dist/* here if no path is given, so this seems to be a reasonable request (to be opened in twine). But twine upload dist/* isn't that hard to type either, and all the tutorials have it that way (I think).

@henryiii
Contributor

henryiii commented Apr 29, 2021

for instance python -m wheel generates wheels for dependencies for packages

That's because this is being misused; you should use build instead, which does not generate extra packages, because it's not setting up a wheelhouse but instead building your package.

You can do twine upload dist/packagename-* if you want. Twine won't know what you are trying to upload from dist either if you make it a wheelhouse.

@henryiii
Contributor

generates wheels for dependencies for packages with C extensions

Not exactly, it generates wheels for any dependencies that do not have wheels already uploaded for your system. So if numpy doesn't have a wheel for your arch, then it will build NumPy and put the wheel in the wheelhouse. If it's already on PyPI for your system, it will just use PyPI and will not add it to the built wheels in the wheelhouse. The wheelhouse + PyPI (or whatever index you are using) should be enough to install the current package without building anything.

@rth
Author

rth commented Apr 29, 2021

Thanks for the clarifications!

This would be a mess

Yeah, I realize it's a complex subject, I'm clearly not suggesting it in this issue.

That's because this is being misused, you should use build instead

Then https://packaging.python.org/guides/distributing-packages-using-setuptools/#wheels needs updating.

But twine upload dist/* isn't that hard to type either,

It's not about ease of use but about confidence that it won't upload any unrelated wheels (or versions) that happen to be in that folder. And also about making it harder, from a UI perspective, not to upload a wheel if it's already generated by build.

@rth
Author

rth commented Apr 29, 2021

Then packaging.python.org/guides/distributing-packages-using-setuptools/#wheels needs updating.

Also, unless I'm mistaken, that page doesn't mention the build tool at all. I get that the path is "packaging with setuptools", but the title is very general ("Packaging and distributing projects"), and I think users would often end up there, rather than on the tutorial, when searching.

@henryiii
Contributor

henryiii commented Apr 29, 2021

Then ... needs updating

I agree. And there is too much duplication in packaging.python.org, IMO...

So some logic that does the correct version filtering by default without relying on bash globing might help. As poetry publish does I imagine.

poetry publish reads pyproject.toml and pulls the name from there. Twine can't do that - what would you read? pyproject.toml PEP 621 metadata? setup.cfg? setup.py? You can't read a Python file reliably, and running it can have side effects. Twine can upload files from other sources too, not just setuptools, and can run from any directory. Because it is such a general tool, I don't think you can make it much "smarter" than it currently is. Maybe a smart default based on PEP 621 would be interesting, though - something to discuss over on the twine issues.
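A rough sketch of what a PEP 621-based default could look like, assuming the project name is declared statically in pyproject.toml; tomli is used here, and tomllib is in the standard library from Python 3.11:

import tomli  # on Python 3.11+: import tomllib as tomli

with open("pyproject.toml", "rb") as f:
    pyproject = tomli.load(f)

name = pyproject["project"]["name"]  # PEP 621 [project] table
# Very rough default; glosses over hyphen/underscore normalization
# between sdist and wheel filenames.
default_paths = [f"dist/{name}-*", f"dist/{name.replace('-', '_')}-*"]
print(default_paths)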

@astrojuanlu

what would you read? pyproject.toml PEP 621 metadata? setup.cfg? setup.py? You can't read a python file reliably, and running it can have side effects.

My understanding is that https://github.com/pypa/pep517/ provides such a unified interface. In fact, that's what pip-tools now uses to universally extract requirements from local packages, without needing to choose which file to parse (have a look at jazzband/pip-tools#1311, as well as this comment: pypa/setuptools#1951 (comment)).

@henryiii
Contributor

Can someone open an issue in twine with the suggestion for the "smart" default, using pep517.meta.load, then?
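For reference, a sketch of how that might look with pep517.meta.load; the exact behaviour is an assumption - it prepares the project's metadata via the PEP 517 hooks, so the build backend must be available:

from pep517.meta import load

dist = load(".")  # prepares metadata for the project in the current directory
name = dist.metadata["Name"]
version = dist.metadata["Version"]
print(f"would default to uploading dist/{name}-{version}*")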

@layday
Member

layday commented Apr 29, 2021

pep517 provides an interface for extracting metadata from wheels. The question that remains is, how do you ascertain that the package name from the wheel is the same as in your setup.cfg or pyproject.toml or who knows what without rebuilding the wheel?

It's not about ease of use but about confidence it won't upload any unrelated wheels (or versions) that happen to be in that folder. And also making it harder from a UI perspective not to upload a wheel it if it's already generated by build.

If *-globbing is too crude, you could search for distributions which begin with the name of your project: twine upload dist/my-project-*.{tar.gz,whl}. See also pypa/build#198, which attempts to address this in a different way.
