Slow performance of functions dsyev and dsyevx (not fully parallelized) #4758
Labels
LAPACK issue
Deficiency in code imported from Reference-LAPACK
Hello developers!
I found that the functions dsyev and dsyevx do not seem to be fully parallelized when OpenBLAS is built with TARGET=ZEN USE_64BITINT=1 DYNAMIC_ARCH=1 NO_CBLAS=0 NO_LAPACK=0 NO_LAPACKE=0 NO_AFFINITY=1 USE_OPENMP=1.
Preliminary testing on an Intel CPU suggests a similar problem.
I'm not sure whether this is a problem with my make configuration, or whether OpenBLAS does not currently implement a fully parallel version of dsyev and dsyevx. I'd appreciate any thoughts or advice, and thanks in advance!
I guess that dsyevr and dsyevd could be better replacements for dsyev. dsyevd is the fastest but consumes more memory, while dsyevr uses much less temporary memory.
So additionally, as a programmer not very familiar with low-level BLAS/LAPACK, I wonder whether it's common to use dsyevr and dsyevd as eigensolvers instead of dsyev. If so, this may not be such an important issue.
Benchmark results (16 cores @ Ryzen 7945HX)
Routines benchmarked: dsyev, dsyevd, dsyevr, dsyevx.
A reproduction of this issue can be found in a GitHub Actions CI run (2 physical cores @ EPYC 7763): https://github.com/ajz34/issue_openblas_dsyev/actions/runs/9578584638/job/26409147609.
For the scripts used on the 16-core Ryzen 7945HX, see https://github.com/ajz34/issue_openblas_dsyev/tree/16-cores-Ryzen-7945HX.
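To put rough numbers on the memory trade-off between dsyevd and dsyevr mentioned above, here is a minimal sketch that tabulates the minimum workspace sizes as I understand them from the Reference-LAPACK routine documentation (JOBZ='V', N > 1). The helper names `min_workspace` and `workspace_bytes` are my own and not part of any library. The default `int_bytes=8` is an assumption matching a USE_64BITINT=1 build. The optimal sizes returned by a workspace query (LWORK = -1) are usually larger than these minimums, and the input/output matrices themselves are not counted.

```python
# Minimum scratch-space requirements as documented for the Reference-LAPACK
# symmetric eigensolver drivers when eigenvectors are requested (JOBZ='V').
# These are lower bounds; a workspace query (LWORK = -1) typically asks for more.

def min_workspace(routine, n):
    """Return (number of doubles in WORK, number of integers in IWORK)."""
    if routine == "dsyev":
        return max(1, 3 * n - 1), 0            # no integer workspace
    if routine == "dsyevd":
        return 1 + 6 * n + 2 * n * n, 3 + 5 * n  # quadratic in n
    if routine == "dsyevr":
        return max(1, 26 * n), max(1, 10 * n)
    if routine == "dsyevx":
        return max(1, 8 * n), 5 * n
    raise ValueError(f"unknown routine: {routine}")


def workspace_bytes(routine, n, int_bytes=8):
    """Total scratch bytes, assuming 8-byte doubles and int_bytes-wide integers."""
    doubles, ints = min_workspace(routine, n)
    return 8 * doubles + int_bytes * ints


if __name__ == "__main__":
    n = 1000
    for name in ("dsyev", "dsyevd", "dsyevr", "dsyevx"):
        print(f"{name:7s} {workspace_bytes(name, n) / 1e6:10.3f} MB")
```

Only dsyevd has a workspace that grows quadratically with the matrix size, which is consistent with the observation that it is the most memory-hungry of the four drivers.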