Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

?GEMM documentation is fuzzy on NaN handling when alpha or beta is zero #1077

Open
1 of 2 tasks
TiborGY opened this issue Nov 26, 2024 · 2 comments
Open
1 of 2 tasks

Comments

@TiborGY
Copy link

TiborGY commented Nov 26, 2024

Preamble

The descriptions for ?GEMM start as:

DGEMM performs one of the matrix-matrix operations
C := alpha*op( A ) * op( B ) + beta*C

Assuming this is computed using IEEE rules, if alpha is zero and A or B contain a NaN or +/-Inf, then the result should contain NaNs, as 0.0 * Inf == NaN and 0.0 * NaN== NaN. The same goes if beta is zero and C contains a NaN or +/-Inf.

Issue 1

As far as I can tell from a cursory look, this is not how ?GEMM actually behaves in the Netlib implementation.

  1. Due to the early return at line 271 of DGEMM, if alpha is zero and beta is one, NaNs or Infs in A or B do not affect the result.
  2. If both alpha and beta are zero, then a zero matrix is returned (line 279), irrespective of any NaNs or Infs in A, B or C.
  3. If alpha is zero and beta is neither zero or one, then beta*C is returned, irrespective of any NaNs or Infs in A or B.
  4. If only beta is zero, alpha*op( A )*op( B ) is returned, irrespective of any NaNs or Infs in C.

Therefore there are inconsistencies between the behavior expected from reading the docs and actual program behavior. These are arguably corner cases, but NaN propagation is an important mechanism, and callers should be made aware if a given subroutine ignores NaN inputs under certain circumstances.
One example of this mattering, is when one of the matrices may contain uninitialized data, leading to Issue#2

Issue 2

The documentation does not make it clear enough when it is OK to call ?GEMM with matrices containing uninitialized data.
For example, the simplest use case for ?GEMM is computing C = A*B. The current documentation states that:

When BETA is supplied as zero then C need not be set on input.

and

Before entry, the leading m by n part of the array C must contain the matrix C, except when beta is zero, in which case C need not be set on entry.

However depending on a person's interpretation of what it means for C to be "set", this may fail to convey that if beta is zero, then C is permitted to be contain arbitrary bytes (including ones that resolve to NaNs or Infs) before entry. As a result, callers may unnecessarily zero-fill C before calling ?GEMM.

The same is also true for the other corner cases where alpha is zero, where both A and B are allowed to be filled with arbitrary bytes before entry without any effect on the output. Since the current docs do not mention any special behavior for alpha == 0.0, one would incorrectly assume that A and B have to be initialized to guarantee correct results, even if alpha is zero.

Suggested fixes

  1. Change the docs to point out that ?GEMM does not compute C := alpha*op( A )*op( B ) + beta*C exactly as written, and as a consequence does not adhere to all of the NaN and Inf propagation rules that the formula implies.
  2. Document the corner cases.
  3. Add notes about when the matrices are permitted to contain arbitrary bytes before entry.

Checklist

  • I've included a minimal example to reproduce the issue
  • I'd be willing to make a PR to solve this issue
@TiborGY TiborGY changed the title ?GEMM documentation fuzzy on NaN handling when alpha or beta is zero ?GEMM documentation is fuzzy on NaN handling when alpha or beta is zero Nov 26, 2024
@langou
Copy link
Contributor

langou commented Nov 27, 2024

Hi @TiborGY,

(*) You are correct about the behavior of GEMM in the presence of alpha = 0 or beta = 0 and in the presence of NaNs/Infs in the matrices

(*) I think your two main points: (*) the sentence "C needs not be set on entry" is not clear, and (*) the behavior for alpha = 0.0 is not described. These are fair points. We can improve the documentation. Maybe "C needs not be initialized on entry."

(*) We have a report out at https://arxiv.org/pdf/2207.09281 where we looked at NaN propagation in the BLAS and LAPACK. What you mention for GEMM is in Section 2.3.1 "How to interpret alpha = 0 or beta = 0 in C = alpha*A*B + beta*C". See as well, the related Correctness'22 paper cited below.

Cheers,
Julien.

James Demmel, Jack Dongarra, Mark Gates, Greg Henry, Julien Langou, Xiaoye Li, Piotr Luszczek, Weslley Pereira, Jason Riedy and Cindy Rubio-González. Proposed Consistent Exception Handling for the BLAS and LAPACK. In Sixth International Workshop on Software Correctness for HPC Applications (Correctness 2022), a workshop of ACM/IEEE SC 2022 Conference (SC'22), Dallas, TX, USA, November 13-18, 2022. https://doi.org/10.1109/Correctness56720.2022.00006

@TiborGY
Copy link
Author

TiborGY commented Nov 27, 2024

Thanks for reading my post Julien.
I'll whip up a PR for adding these to the docs. It was surprising behavior, at least for me.

Best,
Tibor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants