Hey there,
In a multi-worker run, e.g. on a cluster using Slurm, is there a good way to free up some resources, i.e. to stop or kill a worker without losing the result for the parameter combination that worker was evaluating? Currently, if I just kill a worker, the result for its parameter combination is lost, and the next free worker does not continue or restart that combination. Is there a way to kill a worker so that the next free worker restarts or continues the killed worker's job?
Thanks
Thomas