Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using entry_points makes scuba slow #71

Closed
JonathonReinhart opened this issue Aug 2, 2016 · 8 comments
Closed

Using entry_points makes scuba slow #71

JonathonReinhart opened this issue Aug 2, 2016 · 8 comments
Milestone

Comments

@JonathonReinhart
Copy link
Owner

While working on Bash completion #46, I noticed a delay as the completion script invoked scuba to get the aliases:

When installed (using pip install . to /usr/bin/scuba):

$ time scuba --list-aliases
real    0m0.222s
user    0m0.199s
sys 0m0.023s

When running from source:

$ time python -m scuba --list-aliases
real    0m0.090s
user    0m0.081s
sys 0m0.009s

Of course, 90 ms to get the aliases could be improved, but what accounts for this ~130 ms difference in the installed version?

@JonathonReinhart
Copy link
Owner Author

The analysis was done with this data: scuba_list_aliases.zip, generated by:

python -m cProfile -o scuba_list_aliases.prof /usr/bin/scuba --list-aliases

Using snakeviz, I see that most of this was due to import pkg_resources, which took 196ms by itself:
snakeviz: import pkg_resources

And another 31 ms was spent on pkg_resources.load_entry_point:
snakeviz: pkg_resources.load_entry_point

Why is this so slow?

@JonathonReinhart
Copy link
Owner Author

Upstream issue: pypa/setuptools#510

@ninjaaron
Copy link

ninjaaron commented Aug 4, 2016

I wrote a tiny module called fastentrypoints that monkey patches the mechanism behind entry_points to generate scripts that don't import pkg_resources.

https://github.com/ninjaaron/fast-entry_points

@JonathonReinhart
Copy link
Owner Author

Thanks @ninjaaron! I'll have a look at that.

@JonathonReinhart
Copy link
Owner Author

JonathonReinhart commented Aug 7, 2016

Interesting: I noticed after installing scuba from testpypi using pip (version 8.1.2), that the bash completion was noticeably (and measurably) faster.

It turns out that when installing a wheel (bdist_wheel) using pip, the generated script doesn't use pkg_resources:

#!/usr/bin/python

# -*- coding: utf-8 -*-
import re
import sys

from scuba.__main__ import main

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(main())

Now I read (pypa/pip#2874 (comment)):

Since 7.0.0 sdist are converted to wheels before being installed. As a result, a package is never setup.py install'ed but always bdist_wheel'ed first.

I confirmed that installing via pip from an sdist, this is true:

$ python setup.py sdist
$ sudo pip install dist/scuba-2.0.0a1.tar.gz 
Processing ./dist/scuba-2.0.0a1.tar.gz
Requirement already satisfied (use --upgrade to upgrade): PyYAML in /usr/lib64/python2.7/site-packages (from scuba==2.0.0a1)
Building wheels for collected packages: scuba
  Running setup.py bdist_wheel for scuba ... done
  Stored in directory: /root/.cache/pip/wheels/8f/e6/72/c9f35b565133e57c796d19a4534116670428b3c9ab0470f3bc
Successfully built scuba
Installing collected packages: scuba
Successfully installed scuba-2.0.0a1

But that doesn't appear to always be the case.

sdist from pypi

Installing the latest release 1.7.0 (which was only released as an sdist, see #71 (comment)),

$ sudo pip install --no-cache scuba==1.7.0
Collecting scuba==1.7.0
  Downloading scuba-1.7.0.tar.gz
Requirement already satisfied (use --upgrade to upgrade): PyYAML in /usr/lib64/python2.7/site-packages (from scuba==1.7.0)
Installing collected packages: scuba
  Running setup.py install for scuba ... done
Successfully installed scuba-1.7.0
$ cat /usr/bin/scuba 
#!/usr/bin/python
# EASY-INSTALL-ENTRY-SCRIPT: 'scuba==1.7.0','console_scripts','scuba'
__requires__ = 'scuba==1.7.0'
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.exit(
        load_entry_point('scuba==1.7.0', 'console_scripts', 'scuba')()
    )

pip install direct from source

If I install this from source using sudo pip install . the old pkg_resources entrypoint is generated. So it's almost as if installing from a directory using pip somehow uses the old setup.py (setuptools) behavior.

pip install local .whl

However, doing the following:

$ python setup.py bdist_wheel
$ sudo pip install dist/scuba-2.0.0a1-py2.py3-none-any.whl

gives the desired behavior.

...Sigh, the relationship between all of these projects is complex.

@JonathonReinhart
Copy link
Owner Author

JonathonReinhart commented Aug 7, 2016

Since my .travis-ci.yml builds a bdist_wheel, and the installation instructions say to install with pip, this issue shouldn't affect most users.

The only users that would be affected would be those installing directly from source (i.e. the Git repo). For those users, forcing a wheel installation will ensure the faster entry point:

$ sudo pip install wheel
$ python setup.py bdist_wheel
$ sudo pip install dist/scuba-....whl

Update: It looks like only my sdist distributions are being built and sent to pypi. See this build log from my 1.7.0 tag. And only a source distribution is available on pypi for 1.7.0. 😢 Why isn't the bdist_wheel in my .travis-ci.yml being built? I have an email out to [email protected] about this, as I can't find anything on the web about it.

In the meantime, this works:

$ sudo pip install wheel
$ pip wheel scuba                     # Download scuba from pypi and build a wheel from it
$ sudo pip install ./scuba-....whl    # Install local wheel archive

Update 2:
It turns out there was an indentation error in my .travis.yml file; my distributions key was accidentally under the deploy.on section. This is fixed in 9117d3c. So my original statement about this only affecting users who install directly from sources is now actually true.

@ninjaaron
Copy link

ninjaaron commented Aug 7, 2016

Yes, I've also been reading about the relationship of this horrible entry_point script and wheels. I've begun building my packages with pip wheel, which I guess is similar to what you're doing. As you've mentioned, there are many corner cases where setup.py still ends up being run at install. Another issue is that most Linux (and I assume homebrew) package maintainers seem to still be in the habit of using setup.py to install.

To keep users from getting the slow entry_points script no matter how they install, I'm putting this after the bit where I import setuptools in all my packages with entry_points:

try:
    from urllib import request
except ImportError:
    import urllib2 as request
fastep = request.urlopen('https://raw.githubusercontent.com/ninjaaron/fast-entry_points/master/fastentrypoints.py')
namespace = {}
exec(fastep.read(), namespace)

It copies the monkey patch into memory from my repo and execs it (in it's own namespace, of course). You could try that, or fork the repo to ensure that you don't pull some kind of exploit code or... I don't know. I realize this solution is kind of crazy, but it's better than waiting an extra 400ms for pkg_resources to load.

JonathonReinhart added a commit that referenced this issue Aug 7, 2016
This is the reason why my bdist_wheel packages weren't being built.

See: #71
JonathonReinhart added a commit that referenced this issue Aug 7, 2016
This tests that scuba behaves as expected when installed as a wheel,
which is needed for performance (see #71).
JonathonReinhart added a commit that referenced this issue Aug 7, 2016
This tests that scuba behaves as expected when installed as a wheel,
which is needed for performance (see #71).
@JonathonReinhart
Copy link
Owner Author

Now that Travis is pushing wheels out to pypi, and I've updated the README for the unlikely source installations, I'm considering this issue resolved.

Thanks for your feedback and moral support, @ninjaaron!

@JonathonReinhart JonathonReinhart added this to the v2.0 milestone Aug 7, 2016
aodj added a commit to multiplechoice/scrapy-sqs-exporter that referenced this issue Jun 7, 2017
After some searching, it turns out that the `distributions` key was put within the `on` key (which makes no sense) so that any changes to it would not be detected, since the value of `distributions` was defaulting to just "sdist". Thanks to this guy: JonathonReinhart/scuba#71 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants