Currently the `runspersub` option requires the user to make compensating modifications to the requested `walltime` to ensure that multiple runs can complete within a single PBS submission.
This has been a source of confusion in the past.
ACCESS-NRI is working up configurations for ACCESS-ESM1.5, which has a maximum run length of 1 year per run. However, it is a low-resolution ESM that typically requires very long runs to equilibrate its slow carbon cycle, so it would be convenient to set `runspersub: 20` to minimise PBS queue time and avoid a proliferation of PBS logs.
However, this would mean the default configuration would request a PBS walltime of 48 hours, which would hurt queue turnaround for users doing short test runs.
The proposal is to alter the logic so that `walltime` is set by the user to reflect how long a single run of the model takes. `runspersub` and the number of runs requested would then be used to scale the requested walltime so the job can complete (essentially `submit_walltime = min(runs, runspersub) * walltime`).
A nice feature of this is that `runspersub` can be left set to a large number: however many runs a user selects, the submitted walltime is scaled accordingly, up to a maximum of `runspersub * walltime`.
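As a rough illustration, here is a minimal Python sketch of the proposed scaling. The function and variable names are hypothetical, not existing payu internals:

```python
def parse_walltime(walltime):
    """Convert a HH:MM:SS string (or plain seconds) to seconds."""
    if isinstance(walltime, str) and ':' in walltime:
        hours, minutes, seconds = (int(part) for part in walltime.split(':'))
        return hours * 3600 + minutes * 60 + seconds
    return int(walltime)

def scale_walltime(walltime, runs, runspersub):
    """Return the walltime to request from PBS, where walltime is the
    time required for a single model run."""
    runs_this_submit = min(runs, runspersub)
    total_seconds = parse_walltime(walltime) * runs_this_submit
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f'{hours:02d}:{minutes:02d}:{seconds:02d}'

# e.g. a single run takes 2:30:00, the user asks for 5 runs with runspersub: 20
print(scale_walltime('2:30:00', runs=5, runspersub=20))  # -> 12:30:00
```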
Clearly this would require clear, informative messages so the user knows how the PBS submission is being altered.
There is some precedent here in the way payu pads CPU requests to a multiple of whole nodes, or sets memory limits when none are specified.
If backwards compatibility was required, or if it was clearer for users, there could be a new config option `runtime` which is used to calculate `walltime` when `walltime` isn't specified.
> If backwards compatibility was required, or if it was clearer for users, there could be a new config option `runtime` which is used to calculate `walltime` when `walltime` isn't specified.
If we did this it might make sense to make `runtime` and `walltime` mutually exclusive, so the user sets one or the other; with the default `runspersub: 1` they would have identical practical outcomes.
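A hedged sketch of how that exclusivity might be resolved when building the submission, reusing the `scale_walltime` helper from the earlier sketch (the config keys follow this proposal, the helper name is made up for illustration):

```python
def resolve_submit_walltime(config):
    """Work out the PBS walltime from either runtime or walltime."""
    runtime = config.get('runtime')    # proposed: time for a single model run
    walltime = config.get('walltime')  # existing: time for the whole submission

    if runtime is not None and walltime is not None:
        raise ValueError('Set either "runtime" or "walltime", not both')

    if runtime is None:
        # Backwards-compatible path: walltime is used as-is
        return walltime

    runs = config.get('runs', 1)
    runspersub = config.get('runspersub', 1)
    return scale_walltime(runtime, runs, runspersub)  # defined in the sketch above
```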
Calculating the final walltime for users still requires them to be aware of the maximum walltime of the queue they're using. If they modify the model configuration so a single run takes longer, they would need to change `runtime` and `runspersub`, otherwise they may exceed the maximum walltime of the queue, which would waste a lot of resources.
If `maxwalltime` was defined in the platform config, and set to the known defaults in payu, then it would only require changing `runtime`, and payu could check that the current settings are consistent.
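Such a check could be quite small; a sketch, assuming a `maxwalltime` entry were added to the platform config and reusing `parse_walltime` from the first sketch (names are illustrative only):

```python
def check_walltime(submit_walltime, maxwalltime):
    """Abort before submission if the scaled walltime exceeds the queue maximum."""
    if maxwalltime is None:
        return  # No known limit for this platform/queue
    if parse_walltime(submit_walltime) > parse_walltime(maxwalltime):
        raise ValueError(
            f'Requested walltime {submit_walltime} exceeds the queue maximum '
            f'{maxwalltime}; reduce runspersub or runtime'
        )
```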