Preamble
The descriptions for ?GEMM start as:
DGEMM performs one of the matrix-matrix operations
C := alpha*op( A ) * op( B ) + beta*C
Assuming this is computed using IEEE rules, if alpha is zero and A or B contains a NaN or +/-Inf, then the result should contain NaNs, since 0.0 * Inf and 0.0 * NaN both evaluate to NaN. The same goes if beta is zero and C contains a NaN or +/-Inf.
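A minimal sketch of that IEEE rule in Python, using only the standard library. The variable names and the one-entry "literal" evaluation are illustrative assumptions, not anything taken from the BLAS sources.

```python
import math

inf = math.inf
nan = math.nan

# IEEE 754: zero times an infinity or a NaN is NaN, not zero.
zero_times_inf = 0.0 * inf
zero_times_nan = 0.0 * nan

# NaN compares unequal to everything, so test with math.isnan.
print(math.isnan(zero_times_inf))  # True
print(math.isnan(zero_times_nan))  # True

# Literal evaluation of one output entry of alpha*A*B + beta*C
# with alpha == 0 and an Inf in the row of A (hypothetical data).
alpha, beta = 0.0, 1.0
a_row, b_col, c_entry = [1.0, inf], [2.0, 3.0], 5.0
dot = sum(x * y for x, y in zip(a_row, b_col))
literal = alpha * dot + beta * c_entry
print(math.isnan(literal))  # True
```

So an implementation that evaluated the formula exactly as written would turn that entry of C into NaN even though alpha is zero.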
Issue 1
As far as I can tell from a cursory look, this is not how ?GEMM actually behaves in the Netlib implementation.
Due to the early return at line 271 of DGEMM, if alpha is zero and beta is one, NaNs or Infs in A or B do not affect the result.
If both alpha and beta are zero, then a zero matrix is returned (line 279), irrespective of any NaNs or Infs in A, B or C.
If alpha is zero and beta is neither zero nor one, then beta*C is returned, irrespective of any NaNs or Infs in A or B.
If only beta is zero, alpha*op( A )*op( B ) is returned, irrespective of any NaNs or Infs in C.
Therefore there are inconsistencies between the behavior expected from reading the docs and actual program behavior. These are arguably corner cases, but NaN propagation is an important mechanism, and callers should be made aware if a given subroutine ignores NaN inputs under certain circumstances.
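The four cases listed above can be modeled with a short pure-Python sketch. This is an assumption-laden model of the control flow as described in this issue, not a translation of the Netlib source: `gemm_model` is a hypothetical name, matrices are lists of rows, and transposes are left out.

```python
import math


def gemm_model(alpha, a, b, beta, c):
    """Return the updated C for C := alpha*A*B + beta*C, mimicking the
    special-casing of alpha == 0 and beta == 0 described in the issue.
    a is m x k, b is k x n, c is m x n (lists of rows)."""
    m, n, k = len(c), len(c[0]), len(b)
    # alpha == 0 and beta == 1: early return, A and B are never read.
    if alpha == 0.0 and beta == 1.0:
        return [row[:] for row in c]
    # alpha == 0: C is zeroed or scaled, A and B are never read.
    if alpha == 0.0:
        if beta == 0.0:
            return [[0.0] * n for _ in range(m)]
        return [[beta * x for x in row] for row in c]
    out = []
    for i in range(m):
        row = []
        for j in range(n):
            # beta == 0: start from an exact zero, the old C is never read.
            acc = 0.0 if beta == 0.0 else beta * c[i][j]
            for l in range(k):
                acc += alpha * a[i][l] * b[l][j]
            row.append(acc)
        out.append(row)
    return out


nan, inf = math.nan, math.inf
a_bad = [[nan, 1.0], [2.0, inf]]   # A polluted with NaN and Inf
a_ok = [[1.0, 2.0], [3.0, 4.0]]
ident = [[1.0, 0.0], [0.0, 1.0]]   # B = identity, so A*B == A
c_ok = [[1.0, 2.0], [3.0, 4.0]]
c_bad = [[nan, inf], [inf, nan]]   # C polluted with NaN and Inf

print(gemm_model(0.0, a_bad, ident, 1.0, c_ok))   # [[1.0, 2.0], [3.0, 4.0]]
print(gemm_model(0.0, a_bad, ident, 0.0, c_bad))  # [[0.0, 0.0], [0.0, 0.0]]
print(gemm_model(0.0, a_bad, ident, 2.0, c_ok))   # [[2.0, 4.0], [6.0, 8.0]]
print(gemm_model(1.0, a_ok, ident, 0.0, c_bad))   # [[1.0, 2.0], [3.0, 4.0]]
```

In each of the four calls the polluted matrix is never read, so no NaN reaches the output, which is exactly where the behavior departs from a literal reading of the formula.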
One case where this matters is when one of the matrices may contain uninitialized data, which leads to Issue 2.
Issue 2
The documentation does not make it clear enough when it is OK to call ?GEMM with matrices containing uninitialized data.
For example, the simplest use case for ?GEMM is computing C = A*B. The current documentation states that:
When BETA is supplied as zero then C need not be set on input.
and
Before entry, the leading m by n part of the array C must contain the matrix C, except when beta is zero, in which case C need not be set on entry.
However, depending on a person's interpretation of what it means for C to be "set", this may fail to convey that if beta is zero, then C is permitted to contain arbitrary bytes (including ones that resolve to NaNs or Infs) before entry. As a result, callers may unnecessarily zero-fill C before calling ?GEMM.
The same applies to the corner cases where alpha is zero: A and B may be filled with arbitrary bytes before entry without any effect on the output. Since the current docs do not mention any special behavior for alpha == 0.0, one would reasonably, but incorrectly, assume that A and B have to be initialized to guarantee correct results even if alpha is zero.
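To illustrate what "arbitrary bytes" can mean for the beta == 0 case, here is a small Python sketch. It assumes big-endian IEEE 754 doubles for readability, and `stale_c` and `product` are hypothetical stand-ins for an uninitialized C buffer and for alpha*op( A )*op( B ).

```python
import math
import struct

# Arbitrary byte patterns reinterpreted as IEEE 754 doubles (big-endian).
all_ones = struct.unpack(">d", b"\xff" * 8)[0]                      # a NaN
pos_inf = struct.unpack(">d", bytes.fromhex("7ff0000000000000"))[0]  # +Inf

stale_c = [all_ones, pos_inf, 3.5]  # stands in for an uninitialized buffer
product = [1.0, 2.0, 3.0]           # stands in for alpha*op(A)*op(B)
beta = 0.0

# Literal evaluation of the formula reads the stale values of C.
literal = [p + beta * c for p, c in zip(product, stale_c)]

# Overwrite semantics: beta == 0 is treated as "do not read C".
overwrite = list(product)

print([math.isnan(x) for x in literal])  # [True, True, False]
print(overwrite)                         # [1.0, 2.0, 3.0]
```

Only the overwrite semantics make it safe to skip zero-filling C, which is why the documentation should state them explicitly.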
Suggested fixes
Change the docs to point out that ?GEMM does not compute C := alpha*op( A )*op( B ) + beta*C exactly as written, and as a consequence does not adhere to all of the NaN and Inf propagation rules that the formula implies.
Document the corner cases.
Add notes about when the matrices are permitted to contain arbitrary bytes before entry.
Checklist
I've included a minimal example to reproduce the issue
I'd be willing to make a PR to solve this issue
TiborGY changed the title from "?GEMM documentation fuzzy on NaN handling when alpha or beta is zero" to "?GEMM documentation is fuzzy on NaN handling when alpha or beta is zero" on Nov 26, 2024
(*) You are correct about the behavior of GEMM when alpha = 0 or beta = 0 and the matrices contain NaNs/Infs.
(*) I think your two main points are: (1) the sentence "C need not be set on entry" is not clear, and (2) the behavior for alpha = 0.0 is not described. These are fair points. We can improve the documentation. Maybe "C need not be initialized on entry."
(*) We have a report at https://arxiv.org/pdf/2207.09281 where we looked at NaN propagation in the BLAS and LAPACK. What you mention for GEMM is in Section 2.3.1, "How to interpret alpha = 0 or beta = 0 in C = alpha*A*B + beta*C". See also the related Correctness'22 paper cited below.
Cheers,
Julien.
James Demmel, Jack Dongarra, Mark Gates, Greg Henry, Julien Langou, Xiaoye Li, Piotr Luszczek, Weslley Pereira, Jason Riedy and Cindy Rubio-González. Proposed Consistent Exception Handling for the BLAS and LAPACK. In Sixth International Workshop on Software Correctness for HPC Applications (Correctness 2022), a workshop of ACM/IEEE SC 2022 Conference (SC'22), Dallas, TX, USA, November 13-18, 2022. https://doi.org/10.1109/Correctness56720.2022.00006