The cost of communication seems to dominate when using the symmetric eigensolvers (PxSYEVD) on 1xP grids. How should the process grid be laid out when NPROCS is prime? How can I force the system to keep some nodes idle in this case, and which of the layers would need modifications: ScaLAPACK, PBLAS, or BLACS? Any insights in this area would be helpful.
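For context on why prime counts are awkward: the only PxQ factorizations of a prime NPROCS are 1xNPROCS and NPROCSx1, so no near-square grid exists. A minimal sketch (plain C, nothing ScaLAPACK-specific; `most_square_grid` is just an illustrative helper) of how the grid shape degenerates:

```c
#include <stdio.h>

/* Most-square PxQ factorization of n: largest divisor p <= sqrt(n).
 * For prime n this degenerates to 1 x n, which is the problem here. */
static void most_square_grid(int n, int *p, int *q) {
    int d, best = 1;
    for (d = 1; d * d <= n; ++d)
        if (n % d == 0)
            best = d;
    *p = best;
    *q = n / best;
}

int main(void) {
    int p, q;
    most_square_grid(189, &p, &q);
    printf("189 -> %d x %d\n", p, q);  /* 9 x 21 */
    most_square_grid(191, &p, &q);
    printf("191 -> %d x %d\n", p, q);  /* 191 is prime: 1 x 191 */
    return 0;
}
```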
What if I am forced to create a 1xP grid to make full use of the system's hardware? The grid parameters are set by the application, and I cannot tweak them.
I'm trying to launch the application under two scenarios:
- `-np 189`, with grid params 9x21 (PxQ)
- `-np 191` (the maximum possible on the target system), with grid params 1x191 (PxQ)
I see considerable performance degradation in the second case, yet I would have expected performance to improve as the number of processes increases. Is there any layer of ScaLAPACK I can exploit to resolve this case when NPROCS is prime?
Well, you could choose block sizes so large that the last process holds nothing, but that also seems sub-optimal. I would suggest you reduce your communicator size: if 189 processes perform much better than 191, then 189 is obviously using your resources better.
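To make the "keep some ranks idle" option concrete: BLACS can build a grid over fewer processes than MPI launches, and ranks left out of the grid see `myrow == -1` from `Cblacs_gridinfo` and can simply sit out the computation. A rough sketch, assuming the C interface to BLACS (`Cblacs_*`); the `usable_grid` helper and its aspect-ratio threshold are illustrative choices, not library code:

```c
#include <mpi.h>

/* C interface to BLACS, provided by the ScaLAPACK/BLACS library. */
extern void Cblacs_pinfo(int *mypnum, int *nprocs);
extern void Cblacs_get(int ctxt, int what, int *val);
extern void Cblacs_gridinit(int *ctxt, const char *order, int nprow, int npcol);
extern void Cblacs_gridinfo(int ctxt, int *nprow, int *npcol, int *myrow, int *mycol);
extern void Cblacs_gridexit(int ctxt);
extern void Cblacs_exit(int notdone);

/* Shrink n until its most-square factorization is not too elongated
 * (aspect ratio <= 4 here; the threshold is an arbitrary choice). */
static void usable_grid(int nprocs, int *p, int *q) {
    int n, d, best;
    for (n = nprocs; n >= 1; --n) {
        best = 1;
        for (d = 1; d * d <= n; ++d)
            if (n % d == 0)
                best = d;
        if (n / best <= 4 * best) {
            *p = best;
            *q = n / best;
            return;
        }
    }
}

int main(int argc, char **argv) {
    int mypnum, nprocs, ctxt, p, q, pr, pc, myrow, mycol;
    MPI_Init(&argc, &argv);
    Cblacs_pinfo(&mypnum, &nprocs);
    usable_grid(nprocs, &p, &q);          /* 191 -> 10 x 19, using 190 ranks */
    Cblacs_get(-1, 0, &ctxt);             /* default system context */
    Cblacs_gridinit(&ctxt, "Row", p, q);  /* first p*q ranks join the grid */
    Cblacs_gridinfo(ctxt, &pr, &pc, &myrow, &mycol);
    if (myrow != -1) {
        /* in-grid ranks: DESCINIT / PxSYEVD calls would go here */
        Cblacs_gridexit(ctxt);
    }
    /* out-of-grid ranks (myrow == -1) skip straight to finalization */
    Cblacs_exit(1);  /* free BLACS resources, leave MPI alive */
    MPI_Finalize();
    return 0;
}
```

With this approach mpirun still launches all 191 ranks; one of them just never joins the grid, which is exactly the "keep some nodes idle" idea from the question, and no changes to ScaLAPACK, PBLAS, or BLACS are needed.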