Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kill pkg_resources for CLI tools [DO NOT MERGE] #38167

Merged
merged 4 commits into from
Dec 22, 2016

Conversation

cachedout
Copy link
Contributor

@cachedout cachedout commented Dec 8, 2016

This is an attempt to solve what's becoming a major performance problem in the Salt CLI toolchain.

What's happening here is that CLI tools such as salt or salt-call have been getting slower and slower. As it turns out, this is due to the pkg_resources modules from the setuptools package.

The root issue is that any import of setuptools will cause Python to traverse the entire Python path, searching for eggs and packages and comparing versions and whatnot. The entry point for this is in the _initialize_master_working_set() function inside the __init__ file for pkg_resources in the setuptools package.

As one might expect, this cost is only incurred on the initial import of pkg_resources. Since it's in the global namespace thereafter, the traversal does not occur on subsequent attempts.

The impacts of this certainly vary based on how many files are in one's Python path, but on a typical machine, the hit is severe. It's between 1.5s and 1.8s to do this. Making things worse, this happens before any Salt code can even run so it effectively imposes a nearly two-second delay on any CLI command. This is simply unacceptable.

This issue is being tracked over in the setuptools repo, here: pypa/setuptools#510

I have commented on that issue here: pypa/setuptools#510 (comment)

I have tried several methods for working around this issue in Salt. Attempting to use mock to monkey-patch the issue does not work because mock itself imports pkg_resources! (Gah!)

One cannot monkey-patch it in a traditional way by replacing the function (ala pkg_resources._initialize_master_working_set = lambda x: 'No soup for you') because one must first import pkg_resources to do so and of course by the time you do this, it's already too late. :-/

It would certainly be great if the setuptools folks gave us a way to disable this behavior ahead of time, through a flag or an environment variable but c'est la vie.

So, as far as Salt goes -- you'll see that I've used a very large hammer to work around this. In effectively disabling the import of pkg_resources, we drop the execution time for our CLI toolchain from ~2s to around 0.3s. Obviously, a significant improvement.

I've attempted to limit this drama to just the CLI tools since the cases these files should only be imported by the CLI wrappers themselves and we shouldn't see this leak into the running daemons.

Some very non-trivial caveats:

  • By putting this in salt-call, we could destroy all manner of things. Stuff which relies on the proper behavior of pkg_resources could be broken. This could be nothing, or it could totally break requests, for example. I don't know right now. Putting it just in the salt CLI tool would be safer of course but having salt-call perform so badly isn't great.

  • Even with this in salt, we risk breaking outputters which could potentially depend on pkg_resources.

That said, I realize this solution is likely not pratical but wanted to raise the issue here and see what ideas come back. Any and all suggestions and feedback are welcome.

Attached is some profiling data for salt-call --local test.ping which illustrates the issue. One can use runsnake to examine this with a GUI or whatever tool one likes to look through cProfile data. :]

patch_profile.zip

Refs #38071

@cachedout
Copy link
Contributor Author

Go Go Jenkins!

Copy link
Collaborator

@s0undt3ch s0undt3ch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding your concern with outputters. Do they load when starting the CLI?
Or when returning data?
If only when returning data, then you could remove your pkg_resources 💣 before loading the returners?

@cachedout
Copy link
Contributor Author

cachedout commented Dec 9, 2016

@s0undt3ch I could but then it would reintroduce the issue because pkg_resources just blocks all execution while Python goes on its merry well to stat every single file in the Python path. :-/

Copy link
Contributor

@terminalmage terminalmage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this as a temporary measure, but we need to look into the possible implications. The salt CLI command should not be impacted much because (and correct me if I am wrong) the LocalClient will just prod the daemon to carry out the job, and the daemon will not be affected. I'm more concerned with salt-call (and especially masterless salt-call).

@cachedout
Copy link
Contributor Author

@terminalmage I think it's probably wise to err on the side of caution and remove this from salt-call for now. We are starting to discuss possible solutions on the upstream side of this. See the issue I linked in the original description for details.

Copy link
Contributor

@DmitryKuzmenko DmitryKuzmenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for me.
Great job, @cachedout!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants