?GEMM documentation is fuzzy on NaN handling when alpha or beta is zero #1077

TiborGY · 2024-11-26T21:16:50Z

Preamble

The descriptions for ?GEMM start as:

DGEMM performs one of the matrix-matrix operations
C := alpha*op( A ) * op( B ) + beta*C

Assuming this is computed using IEEE rules, if alpha is zero and A or B contain a NaN or +/-Inf, then the result should contain NaNs, as 0.0 * Inf == NaN and 0.0 * NaN== NaN. The same goes if beta is zero and C contains a NaN or +/-Inf.

Issue 1

As far as I can tell from a cursory look, this is not how ?GEMM actually behaves in the Netlib implementation.

Due to the early return at line 271 of DGEMM, if alpha is zero and beta is one, NaNs or Infs in A or B do not affect the result.
If both alpha and beta are zero, then a zero matrix is returned (line 279), irrespective of any NaNs or Infs in A, B or C.
If alpha is zero and beta is neither zero or one, then beta*C is returned, irrespective of any NaNs or Infs in A or B.
If only beta is zero, alpha*op( A )*op( B ) is returned, irrespective of any NaNs or Infs in C.

Therefore there are inconsistencies between the behavior expected from reading the docs and actual program behavior. These are arguably corner cases, but NaN propagation is an important mechanism, and callers should be made aware if a given subroutine ignores NaN inputs under certain circumstances.
One example of this mattering, is when one of the matrices may contain uninitialized data, leading to Issue#2

Issue 2

The documentation does not make it clear enough when it is OK to call ?GEMM with matrices containing uninitialized data.
For example, the simplest use case for ?GEMM is computing C = A*B. The current documentation states that:

When BETA is supplied as zero then C need not be set on input.

and

Before entry, the leading m by n part of the array C must contain the matrix C, except when beta is zero, in which case C need not be set on entry.

However depending on a person's interpretation of what it means for C to be "set", this may fail to convey that if beta is zero, then C is permitted to be contain arbitrary bytes (including ones that resolve to NaNs or Infs) before entry. As a result, callers may unnecessarily zero-fill C before calling ?GEMM.

The same is also true for the other corner cases where alpha is zero, where both A and B are allowed to be filled with arbitrary bytes before entry without any effect on the output. Since the current docs do not mention any special behavior for alpha == 0.0, one would incorrectly assume that A and B have to be initialized to guarantee correct results, even if alpha is zero.

Suggested fixes

Change the docs to point out that ?GEMM does not compute C := alpha*op( A )*op( B ) + beta*C exactly as written, and as a consequence does not adhere to all of the NaN and Inf propagation rules that the formula implies.
Document the corner cases.
Add notes about when the matrices are permitted to contain arbitrary bytes before entry.

Checklist

I've included a minimal example to reproduce the issue
I'd be willing to make a PR to solve this issue

The text was updated successfully, but these errors were encountered:

langou · 2024-11-27T08:36:23Z

Hi @TiborGY,

(*) You are correct about the behavior of GEMM in the presence of alpha = 0 or beta = 0 and in the presence of NaNs/Infs in the matrices

(*) I think your two main points: (*) the sentence "C needs not be set on entry" is not clear, and (*) the behavior for alpha = 0.0 is not described. These are fair points. We can improve the documentation. Maybe "C needs not be initialized on entry."

(*) We have a report out at https://arxiv.org/pdf/2207.09281 where we looked at NaN propagation in the BLAS and LAPACK. What you mention for GEMM is in Section 2.3.1 "How to interpret alpha = 0 or beta = 0 in C = alpha*A*B + beta*C". See as well, the related Correctness'22 paper cited below.

Cheers,
Julien.

James Demmel, Jack Dongarra, Mark Gates, Greg Henry, Julien Langou, Xiaoye Li, Piotr Luszczek, Weslley Pereira, Jason Riedy and Cindy Rubio-González. Proposed Consistent Exception Handling for the BLAS and LAPACK. In Sixth International Workshop on Software Correctness for HPC Applications (Correctness 2022), a workshop of ACM/IEEE SC 2022 Conference (SC'22), Dallas, TX, USA, November 13-18, 2022. https://doi.org/10.1109/Correctness56720.2022.00006

TiborGY · 2024-11-27T14:55:49Z

Thanks for reading my post Julien.
I'll whip up a PR for adding these to the docs. It was surprising behavior, at least for me.

Best,
Tibor

TiborGY added the Type: Bug label Nov 26, 2024

TiborGY changed the title ~~?GEMM documentation fuzzy on NaN handling when alpha or beta is zero~~ ?GEMM documentation is fuzzy on NaN handling when alpha or beta is zero Nov 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

?GEMM documentation is fuzzy on NaN handling when alpha or beta is zero #1077

?GEMM documentation is fuzzy on NaN handling when alpha or beta is zero #1077

TiborGY commented Nov 26, 2024 •

edited

Loading

langou commented Nov 27, 2024 •

edited

Loading

TiborGY commented Nov 27, 2024

?GEMM documentation is fuzzy on NaN handling when alpha or beta is zero #1077

?GEMM documentation is fuzzy on NaN handling when alpha or beta is zero #1077

Comments

TiborGY commented Nov 26, 2024 • edited Loading

Preamble

Issue 1

Issue 2

Suggested fixes

langou commented Nov 27, 2024 • edited Loading

TiborGY commented Nov 27, 2024

TiborGY commented Nov 26, 2024 •

edited

Loading

langou commented Nov 27, 2024 •

edited

Loading