Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module load <clusterName> sets environment variables that override values in the sbatch submission script. #223

Closed
gwolski opened this issue Apr 13, 2024 · 2 comments · Fixed by #224
Assignees

Comments

@gwolski
Copy link

gwolski commented Apr 13, 2024

According to the slurm sbatch documentation:

NOTE: Environment variables will override any options set in a batch script, and command line options will override any environment variables.

The module load feature sets certain variables that I don't think it should. I would suggest we not set any of the variables - this should be left to the user or their scripts. I use scripts to submit with #SBATCH lines and they are ignored if environment variables are set.

The one that gave me grief today is the
SBATCH_MEM_PER_NODE=100M

I would suggest we not set these, rather document their existence for users to modify if they want.
Here are the set variables I found ( I replaced my cluster name with ):

SLURM_CPUS_PER_TASK=1
SLURM_TIMELIMIT=1:0:0
SLURM_CPUS_PER_TASK_SET=
SLURM_CLUSTER_NAME=
SLURM_TIMELIMIT_SET=
SLURM_CONF=/opt/slurm//etc/slurm.conf
SLURM_MEM_PER_NODE_SET=
SLURM_PARTITION=batch
SLURM_PARTITION_SET=

SBATCH_PARTITION=batch
SBATCH_MEM_PER_NODE_SET=
SBATCH_TIMELIMIT_SET=
SBATCH_PARTITION_SET=
SBATCH_REQUEUE_SET=
SBATCH_TIMELIMIT=1:0:0
SBATCH_REQUEUE=
SBATCH_MEM_PER_NODE=100M

The setting of these variables is not documented on the documentation pages, nor how to to modify them. Can we modify them somehow as part of the source that generates the module files?

I found them set in

/opt/slurm//config/modules/modulefiles/Rocky/8/x86_64//3.8.0

and I believe the source is in:

./source/resources/playbooks/roles/ParallelClusterHeadNode/templates/opt/slurm/modules/modulefiles/slurm/.template

@cartalla cartalla self-assigned this Apr 24, 2024
@cartalla
Copy link
Contributor

From https://slurm.schedmd.com/sbatch.html

NOTE: Environment variables will override any options set in a batch script, and command line options will override any environment variables.

My intent was to make sure that defaults exist, but not to override variables set in the batch files. I will take those out of the modulefile and see if I can come up with a better way to making sure that defaults are specified. Probably the most important one is the time limit because you don't want to allow users to submit a job and then have it hang and consume resources until someone notices.

@gwolski
Copy link
Author

gwolski commented Apr 24, 2024

I would argue it's not our job to protect the user from these mistakes. They will notice in time and probably only make the mistake once.

Unfortunately I think it's a poor design choice on the part of slurm that environment variables override script variables. I would have designed it so the order of precedence is environment variables, script variables, command line variables. This allows for defaults, a script to have better values, and if someone needs to do a one-off to try something, the command line for sbatch can override things. But then again, they didn't ask me and I've unable to glean their intent for this choice.

You get one vote from me to not have defaults.

cartalla added a commit that referenced this issue Apr 29, 2024
These override batch file settings and are only overridden by command line arguments so this isn't a good place to set defaults.

Resolves #223
@cartalla cartalla linked a pull request Apr 29, 2024 that will close this issue
cartalla added a commit that referenced this issue Apr 29, 2024
These override batch file settings and are only overridden by command line arguments so this isn't a good place to set defaults.

Resolves #223
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants