
Load Imbalance #1

Open
jrhaberstroh opened this issue May 24, 2013 · 3 comments

@jrhaberstroh (Owner)

The NVT equilibration (using cluster_equilibrate) of FMO currently suffers from 400% load imbalance on 72 cores, 12 of which are dedicated to PME. The cause is not known.
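For reference, the run is launched along these lines (the binary name, rank layout, and file names are placeholders, not the actual job script):

```
# 72 MPI ranks total, 12 of them dedicated PME ranks via -npme (placeholder names)
mpirun -np 72 mdrun_mpi -npme 12 -deffnm nvt_equil
```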

@jrhaberstroh (Owner, Author)

This issue branches into several more general optimization questions:

How should I balance PME order and PME cutoff? How many PME threads to use?
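For context, a sketch of the knobs involved (the values shown are typical defaults, not our settings):

```
; real-space / reciprocal-space split for PME (.mdp side)
rcoulomb       = 1.0    ; real-space Coulomb cutoff (nm)
fourierspacing = 0.12   ; PME grid spacing (nm); smaller means more PME work
pme-order      = 4      ; B-spline interpolation order
```

The number of dedicated PME ranks is set on the mdrun side with -npme.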

What is constraints = all-bonds about, and how should I modify lincs-order and lincs-iter to allow for greater parallelization?
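For reference, the relevant .mdp lines (the numbers are the documented defaults, not tuned values):

```
constraints = all-bonds  ; replace all bond lengths, not just X-H bonds, with holonomic constraints
lincs-order = 4          ; expansion order used by LINCS
lincs-iter  = 1          ; extra accuracy iterations
```

I believe the manual's rule of thumb is that lincs-order can be lowered to help domain decomposition as long as (1 + lincs-iter) * lincs-order stays roughly constant.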

What is the maximum number of threads I can use for a certain simulation, and how can I estimate it?
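My rough understanding (please correct me): each domain-decomposition cell has to stay larger than the longest cutoff plus the distance spanned by coupled constraints, so an upper bound on PP ranks is about (box edge / minimum cell size)^3. With made-up numbers, a 7 nm box and a ~1.0 nm minimum cell size would allow at most roughly 7^3 = 343 PP cells before mdrun refuses to decompose the system.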

What is thread-MPI vs. MPI? OpenMP vs. MPI?
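My current understanding, with example command lines (the binary names depend on the local install):

```
# thread-MPI: ranks are threads inside a single mdrun process (single node only)
mdrun -ntmpi 8 -ntomp 4 -deffnm nvt_equil

# real MPI: ranks launched externally, can span nodes; OpenMP threads per rank via -ntomp
mpirun -np 8 mdrun_mpi -ntomp 4 -deffnm nvt_equil
```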

What is the Verlet cutoff scheme vs. the group scheme? The group scheme uses the "charge groups" defined in the topology and has very efficient water loops. Verlet is newer, and works with CUDA and OpenMP.
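For reference, the switch itself is a single .mdp line (Verlet requires 4.6+, as far as I know):

```
cutoff-scheme = Verlet   ; buffered pair lists; needed for GPU acceleration and full OpenMP support
; cutoff-scheme = group  ; legacy scheme built on the topology's charge groups
```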

@jrhaberstroh (Owner, Author)

Increasing from 12 PME cores to 24 PME cores brought the imbalance from 400% down to 200%, but a further increase to 48 PME cores (with 48 MD cores) gave no additional benefit.

Changing pme-cutoff from 0.16 to 0.32 had no effect.
Changing pme-order from 4 to 10 had no effect.
Switching "free-energy = yes" to "free-energy = no" brought the imbalance down to 2%, fixing the issue but leaving me confused. Is it more costly to use the B-state parameters? Is there a way to switch to those parameters without paying this imbalance?
"init-lambda" is no better than "init-lambda-state".

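For concreteness, the block being toggled looks roughly like this (illustrative values, not the production .mdp):

```
free-energy       = yes   ; setting this to "no" is what removed the imbalance
init-lambda-state = 0     ; tried init-lambda instead; no difference
nstdhdl           = 10    ; dH/dlambda output interval
```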
@jrhaberstroh (Owner, Author)

With an average imbalance of 330%:

NOTE: 37.3 % of the available CPU time was lost due to load imbalance
in the domain decomposition.

NOTE: 9.7 % performance was lost because the PME nodes
had less work to do than the PP nodes.
You might want to decrease the number of PME nodes
or decrease the cut-off and the grid spacing.

Maybe nstdhdl = 10 is forcing [-gcom 10]. But that does not make sense, because the load imbalance persists even when equilibrating an excited-state system, where nstdhdl = 0...
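If that guess is right, it should be testable by overriding the global-communication interval directly, though I am not sure whether mdrun will honor a -gcom larger than nstdhdl:

```
# try spacing out global communication (value is arbitrary)
mdrun -gcom 100 -deffnm nvt_equil
```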
