Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve start-up time by deferring slow imports. #6

Closed
wants to merge 2 commits into from

Conversation

kalekundert
Copy link

The configurator package takes 200-300 ms to import, which is enough to cause a noticeable pause when a program is starting up. Almost all of this time is spent importing pkg_resources:

$ python -X importtime -c 'import configurator'
$ python -X importtime -c 'import pkg_resources'

By deferring this import until it is needed, configurator can start up in about 7 ms—much faster. This is really nice if the command being executed doesn't actually need to read a config file, e.g. if it's just displaying usage help or something.

The `configurator` package takes 200-300 ms to import, which is enough to
cause a noticeable pause when a program is starting up.  Almost all of
this time is spent importing `pkg_resources`:

$ python -X importtime -c 'import configurator'
$ python -X importtime -c 'import pkg_resources'

By defering this import until it is needed, `configurator` can start up
in about 7 ms---much faster.  This is really nice if the command being
executed doesn't actually need to read a config file, e.g. if it's just
displaying usage help or something.
Copy link
Member

@cjw296 cjw296 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All seems very pragmatic, if you wanted to take it further, you could see about only loading each library when it's actually used.

Let me know if you want to take a stab at that instead, if not, also let me know and I'll merge this and crank out a release.

@@ -12,6 +10,9 @@ class Parsers(dict):

@classmethod
def from_entrypoints(cls):
# `pkg_resources` is slow to import, so defer until we need it.
from pkg_resources import iter_entry_points
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit ugly. Why is important pkg_resources so slow? Have you reported the issue upstream at https://github.com/pypa/setuptools?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bit ugly. I don't know why it's so slow, but there are already a couple issues open relating to this: pypa/setuptools#510 pypa/setuptools#926.

@kalekundert
Copy link
Author

I added another commit to defer a few more imports, specifically copy and pprint. This is only a small improvement: it brings this import time down from ~7 ms to ~2 ms. But both modules are only used in one place, so it's not disruptive to defer them. pprint in particular is fairly unlikely to actually be needed, since it's only used in ConfigNode.__repr__(), which is more of a debugging tool.

I don't think it's worth doing anything more thorough than this, since there's really not much time left to save. Go ahead and merge if everything looks good to you.

Thanks for writing this library, by the way. I just found it a few weeks ago. In the past I'd write my own half-baked code every time I need to merge config files, but this is much nicer.

@cjw296
Copy link
Member

cjw296 commented Apr 11, 2020

I have to be honest, I'm having second thoughts about this PR. I'm really not a fan local imports, particularly those in a9f8e04. I'm also curious about the actual win here: if a Configurator is instantiated, then you're going to have to pay the cost of these imports, unless I'm missing something? I would hope that use of whatever you're building is more more than the number of times --help is run?

How about having the import of configurator in your library be moved to where it's instantiated?
Would that give you the same performance win?

Apologies for all the questions, trying to get to the right solution :-)

@kalekundert
Copy link
Author

No worries. My application is a command line tool with a number of different subcommands, many of which don't read any configuration files. I was trying to make the commands a little more snappy, so I profiled the imports and noticed that almost all of the import time was devoted to configurator, entirely because it imports pkg_resources. Obviously this is a waste of time for the commands that don't read the config files. But even for the commands that do, there's a subtle problem: if I started the command, then immediately decided to Ctrl-C it for some reason (e.g. I noticed an argument was wrong), I would often get a configurator/pkg_resources stack trace despite the fact that my main function has a nice handler for keyboard interrupts.

None of this is a big problem for me; as you point out, I can import configurator locally in my own project so that only the subcommands that need it pay the price. This is a bit of a gotcha, though, which is why I thought it'd be better to address in configurator itself.

Feel free to revert a9f8e04 (or I can do it, if that would be easier). I was just trying to address your comment about loading each library when it's used, but the performance improvement is negligible. I do think that pprint probably should be a local import, but it doesn't really matter.

@cjw296
Copy link
Member

cjw296 commented Apr 14, 2020

I'm starting to think that allowing parsers to be provided by setuptools entrypoints might just be a bad idea :-/

@cjw296
Copy link
Member

cjw296 commented Apr 15, 2020

Can you test 2.0.0? My hope is this will have much better performance, even when you use it!

@cjw296 cjw296 closed this Apr 15, 2020
@kalekundert
Copy link
Author

2.0.0 seems much faster. Thanks for taking the time to work on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants