Slow response from Windows minions on 2018.3 #48882
Comments
ping @saltstack/team-windows any ideas here? |
Some ideas:
|
This is an issue with the way Python forks processes on Windows. |
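For anyone who wants to check how much of the delay is plain process start-up on their own box, here is a minimal sketch. It does not touch Salt at all: it only compares the cost of starting a fresh Python interpreter (what a new child process costs on Windows, where there is no native fork()) with the cost of starting a thread. The function names are made up for this example.

```python
import subprocess
import sys
import threading
import time


def time_new_interpreter():
    """Return the seconds needed to start a fresh Python interpreter that does nothing."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - start


def time_new_thread():
    """Return the seconds needed to start and join a thread that does nothing."""
    start = time.perf_counter()
    worker = threading.Thread(target=lambda: None)
    worker.start()
    worker.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"new interpreter: {time_new_interpreter():.3f}s")
    print(f"new thread:      {time_new_thread():.6f}s")
```

Run it on both the Windows minion and a Linux box. A real minion child also has to import Salt and load its modules, so treat the interpreter number as a lower bound.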
My understanding is that there is a timeout, and that the minion is meant to respond with "I am working on it, please wait". I suspect this is not happening on Windows, as we should not need to use the -t option. See also #48470. A better solution may be not to fork at all, but to call something like "salt-call" that reads grains etc. from a file created when the minion started (if this is quicker) and passes the results back. |
We are having the same issues on Windows.
We had to increase For 2 VM on the same host to answer a One difference in my case is that the minions are running Python 3 on Windows |
Been dealing with this behavior on Windows for a while now... I have yet to see anything to improve response times |
I've been testing Salt for a primarily Windows environment for a few days. x64 Py3 Windows minion 2018.3.2
|
It sounds a lot like adding path exceptions to Windows Defender might stop on-access scanning within those paths, but isn't excluding processes launched from other types of scanning. You might get similar symptoms from poor disk throughput though. Are you using shared storage for the system drive? |
Current testing is limited to a pair of VMs running on VirtualBox (master and a Server 2016 minion). Those are just being backed by a consumer-grade SSD. |
How is your CPU allocated to the VMs? I imagine that if you are using a type 2 hypervisor you could end up with some weird latency / timeout issues if CPU cores are over committed (i.e. you have 4 CPU cores available, and both of these VMs are assigned 4 CPU cores). |
4 Cores available - 1 assigned to salt master, 2 assigned to windows server VM |
Very slow response on Win2016 in Azure. I must increase the timeout to 120s; otherwise the "Minion did not return. [No response]" error appears. |
I have an idea on how to fix it, i.e. stop forking for find_job. In salt/minion.py
Could not get the above to work, i.e. it needs someone who knows this code better. Try 2 was in _handle_decoded_payload
The above resulted in a single fork instead of two on Linux. Looking at the code, Windows in general appears to have checks that try to stop it forking as much as Linux does, which is a lot faster at forking. Option 1 would be best for Windows. Can someone who knows minion.py better have a look at this? Maybe you, @Ch3LL |
Any update? Do you plan to fix it? Working with more than 2 minions on Win2016 is impossible because of timeouts. |
salt-call is faster on Windows than going via the salt-master. It should be the number one priority for Windows. Unfortunately I do not know the code well enough in salt/minion.py to fix it. I believe the fix is to pull the find_job code into minion.py so that no fork or extra thread is required for a response. Maybe @cachedout can have a go. I could not get #48882 (comment) to not fork or create a thread. I could not work out how to do a direct return of data. |
Why not use multithreading? That should be much faster, as most of the code would already be loaded |
Threading seems to be expensive as well.
The three important lines, note the 10 second gap.
First number is the PID, the next number is line-number.
|
Also keep in mind that find_job is called about every 5 to 15 seconds while a request is running, and it takes about 10 seconds for it to be processed. |
More digging on Windows. It seems salt.loader.grains(opts) is called on every request. I suspect the more grains there are, the slower this will be. On Linux this only runs once at startup. I suspect minions need to save the collected grains to a file using, say, msgpack, and reload them, and maybe refresh the grains once an hour? As more grains are added, Windows will get slower and slower.
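To make the caching idea concrete, here is a minimal sketch of a file-backed grains cache with an expiry. It is not how Salt implements anything: `load_grains`, the cache path and the `collect` callable are all made up for illustration, and it uses JSON from the standard library instead of msgpack so that it runs without extra packages.

```python
import json
import os
import time


def load_grains(cache_path, collect, max_age=3600):
    """Return grains from cache_path if the file is fresh, else collect and store them.

    collect is a hypothetical zero-argument callable standing in for the slow
    grain collection; max_age is the cache lifetime in seconds (one hour here).
    """
    try:
        age = time.time() - os.path.getmtime(cache_path)
        if age < max_age:
            with open(cache_path, "r", encoding="utf-8") as handle:
                return json.load(handle)
    except (OSError, ValueError):
        pass  # missing, unreadable or corrupt cache: fall through and rebuild

    grains = collect()
    # Write to a temporary file first so a reader never sees a half-written cache.
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(grains, handle)
    os.replace(tmp_path, cache_path)
    return grains
```

With something like this, only the first request after startup (or after the hour is up) pays for grain collection; every other request just reads a small file.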
|
But salt-minion is not multi-threaded afaik, any command to run requires a new process to be created (from scratch as fork() is not supported) |
|
The workaround. I assume that, if you use it, you need to tell the minion to refresh the grains
|
fork() in Python works. It's just very slow. Fork is the default on Unix/Linux. It looks like it uses threads as the default for Windows. However, the issue is that the grains are being refreshed every time on Windows vs. at startup on Linux/Unix. The more Windows grains, the slower it will get (esp. if they use PowerShell, which is slow to start; I developed cmd.shell_info to reduce the number of times PowerShell was called just to get its version info). |
On a side note it would still be better if |
I have a minion with no grains and responses are very slow, so it's not a grain issue. I trigger state files from the salt master. |
All minions have grains, i.e. the core grains like IP, OS version etc. In the log above they are taking just under 10 seconds. |
Ok, sorry, no custom grains ;) |
Basically what is happening. You run a long running task or Hence my 2c is also that find_job needs to be built into minion.py to reduce its overhead on all platforms. That is the second issue. The first issue is that grains are collected every time a request for anything is made. |
@saltstack/team-core Long term I would like to see grains data and config data cached in msgpack (or plain json text file). |
no help. Minion restarted after config changes. |
Try things like PS use highlight the text and click |
time salt -G os:Windows disk.usage
root@SaltMaster:~# time salt -G os:Windows cmd.run "dir \Windows /s"
|
That's good news assuming UK box has the grains_cache setting and the other does not. |
No, it's not. State files still respond very slowly, even if all the tasks are done before and state file should only check current status and do nothing. Example:
2nd time:
State file:
|
win_iis & win_servermanager call PowerShell, so it will be slow. PowerShell takes 10 seconds to start, i.e. 10 seconds to find out the current state of the settings before deciding whether action is required because it does not match the required state. So I am not surprised it took this time: 9 x 10 sec = 90+ seconds.
There will be a small performance improvement, could not tell you if it will be noticeable or not. |
PS you can use a grain instead of |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue. |
IMHO this was not resolved; Windows responses are still much slower than Linux boxes |
The issue was that grains were being run every time, when grains should only be run at startup (or when you use salt-call). This was resolved about a year ago now. I suggest you upgrade. |
Description of Issue/Question
The issue is described and supposedly fixed in #27866, but I have been having the same problem for a long time now.
8 sec might seem like nothing, but try debugging by running a simple command 60 times with an extra 8 sec delay on each run. This is a major pain compared to the older Puppet system I managed.
Setup
Salt 2018.3.2 as master and 2018.3.1 as client on Win2016
Py 2.7 both.
Steps to Reproduce Issue
time salt "client" test.ping
client:
True
real 0m8.364s
user 0m0.952s
sys 0m0.112s
After changing the client config to "multiprocessing: False" (timings below):
But I have read that this is not recommended, and even the config has warnings. So what am I supposed to do, besides looking for alternatives to Salt itself?
client:
True
real 0m1.160s
user 0m0.808s
sys 0m0.136s
The master's own "client" on Ubuntu 16.04 has always been fast. No need to change anything.
master:
True
real 0m1.291s
user 0m0.812s
sys 0m0.136s
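To put the numbers above side by side, here is a small sketch that parses the "real" values from the `time` output and works out the extra cost per call and over 60 calls. `parse_real` is a helper written for this example, not part of Salt.

```python
import re


def parse_real(value):
    """Convert a `time` duration such as "0m8.364s" to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" values copied from the two Windows test.ping runs above.
windows_default = parse_real("0m8.364s")
windows_single_process = parse_real("0m1.160s")

extra_per_call = windows_default - windows_single_process
print(f"extra per call: {extra_per_call:.1f}s")
print(f"extra over 60 calls: {extra_per_call * 60:.0f}s")
```

That is about 7.2 seconds of overhead per call, or roughly 7 minutes of waiting over 60 calls.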
Versions Report
Salt Version:
Salt: 2018.3.2
Dependency Versions:
dateutil: 2.4.2
Jinja2: 2.8
msgpack-python: 0.4.6
pycrypto: 2.6.1
Python: 2.7.12 (default, Dec 4 2017, 14:50:18)
python-gnupg: 0.3.8
PyYAML: 3.11
PyZMQ: 15.2.0
Tornado: 4.2.1
ZMQ: 4.1.4
System Versions:
version: Ubuntu 16.04 xenial