Adding a recursive xLARFT #1080

jprhyne · 2024-11-30T09:48:24Z

Hey Y'all,

I've been working on a recursive implementation of the xLARFT routines. The figures below show improved performance for the DLARFT routine on its own, as well as performance graphs for when DLARFT is used within DGEQRF.

One can argue that, instead of using DGEQR2+DLARFT in DGEQRF, one should use DGEQRT3. That is true, but DLARFT is useful in many other settings where a routine akin to DGEQRT3 does not exist.

Long story short: when we increase the block size of our routine calls (K in xLARFT), we see a drastic improvement in the performance of DLARFT. The performance improvements for DGEQRF are less impressive, since DLARFT (with a reasonably small block size) represents a small portion of DGEQRF.
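
To make that last point concrete, here is a minimal sketch of the usual Amdahl's-law estimate. The function name `overall_speedup` and the numbers in the usage line are hypothetical illustrations, not measurements from these experiments.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: speedup of a whole routine when only `fraction`
    of its runtime (0 <= fraction <= 1) is accelerated by `local_speedup`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Hypothetical numbers: if DLARFT were 10% of DGEQRF's runtime and became
# 2x faster, DGEQRF as a whole would gain only about 5%.
print(round(overall_speedup(0.10, 2.0), 3))
```

The smaller the share of DLARFT in the caller, the less a faster DLARFT can show up in the caller's timings, whatever the local gain.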

For the experiments, I used the HPC at my university, and to give context to the actual performance values, I have included a figure that highlights the performance for square matrix-matrix multiplication.

flopMMM.pdf

newBlocksizePerf.pdf
Note: I previously attached an incorrect file for the above. Please disregard. Sorry for any confusion!

varyBlocksizeDlarftPerf.pdf

If there is interest in the methodology for how these graphs were generated, or anything along those lines, my development of these routines can be found in my xor-xungr repository.

The main algorithm for the recursive xLARFT recurses all the way down to the leaves of the recursion tree (the case of N=1 or K=1). I did not investigate in depth whether an earlier bailout (to the old method or another one) could increase performance, so this is something that could be explored. From some brief toying around on my machine, though, I did not see any real difference when using an optimized BLAS (AOCL).
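
For readers unfamiliar with the idea, the recursion can be sketched roughly as follows. This is an illustration in plain Python, not the Fortran code in this PR. It assumes the 'Forward', 'Columnwise' case with real data, stores V explicitly as a dense list of rows, stops only at K=1, and uses hypothetical helper names (`matmul`, `transpose`, `larft_recursive`). It relies on the identity (I - V1 T1 V1^T)(I - V2 T2 V2^T) = I - V T V^T with V = [V1 V2] and T = [[T1, -T1 V1^T V2 T2], [0, T2]].

```python
def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(A[i][p] * B[p][j] for p in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]


def larft_recursive(V, tau):
    """Return the k-by-k upper-triangular T such that
    H(1) H(2) ... H(k) = I - V T V^T, where H(j) = I - tau[j] v_j v_j^T
    and v_j is column j of V (n-by-k, given as a list of rows)."""
    k = len(tau)
    if k == 1:
        # Base case: a single reflector, so T is the scalar tau.
        return [[tau[0]]]
    k1 = k // 2
    V1 = [row[:k1] for row in V]   # first k1 reflectors
    V2 = [row[k1:] for row in V]   # remaining k - k1 reflectors
    T1 = larft_recursive(V1, tau[:k1])
    T2 = larft_recursive(V2, tau[k1:])
    # Off-diagonal block: T12 = -T1 (V1^T V2) T2
    W = matmul(matmul(T1, matmul(transpose(V1), V2)), T2)
    T12 = [[-x for x in row] for row in W]
    top = [T1[i] + T12[i] for i in range(k1)]
    bottom = [[0.0] * k1 + T2[i] for i in range(k - k1)]
    return top + bottom
```

Most of the work lands in the matrix products for T12, which is where an optimized BLAS would be expected to help as K grows. A production routine would also exploit the unit lower-trapezoidal structure of V and work in place, which this sketch ignores.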

Since this changes how xLARFT is computed, I also moved the existing version into a variant titled 'LL-LVL2'.

I appreciate any feedback on this approach or anything else done!

Checklist

The documentation has been updated.
If the PR solves a specific issue, it is set to be closed on merge.

langou · 2024-11-30T10:12:26Z

SRC/dormqr.f

@@ -309,8 +309,7 @@ SUBROUTINE DORMQR( SIDE, TRANS, M, N, K, A, LDA, TAU, C, LDC,
 *           Form the triangular factor of the block reflector
 *           H = H(i) H(i+1) . . . H(i+ib-1)
 *
-            CALL DLARFT( 'Forward', 'Columnwise', NQ-I+1, IB, A( I,
-     $                   I ),
+            CALL DLARFT( 'Forward', 'Columnwise', NQ-I+1, IB, A( I, I ),


I think that works. The last comma of the line (the one after A(I,I)) seems to be the 72nd character of the line, so I think this change works. More than anything, I am worried about these types of changes in general. See #1079, where @hjjvandam proposes a fix to get some lines below 72 characters when we add _64 here and there.

I think it is best to leave the PR focused on LARFT, so please do not commit changes on GELQF, GERQF, etc. Only [S,D,C,Z]LARFT. If you want to do a separate commit to improve some formatting in some routines here and there, sure. But please keep it separate.

Good point, I didn't think about the extensions. So for the failures that I have elsewhere, I'll keep the line length at or below 68, and hopefully that'll resolve those issues.

Thanks for the point to the PR!
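
A quick way to catch such lines before pushing is a small checker along these lines. This is only a sketch: `find_long_lines` is a hypothetical helper, not something in the LAPACK repository, and it assumes fixed-form source without tab characters. Passing `limit=68` would leave the headroom discussed above.

```python
def find_long_lines(source, limit=72):
    """Return (line_number, length) for each non-comment line of
    fixed-form Fortran `source` that is longer than `limit` columns."""
    offenders = []
    for number, line in enumerate(source.splitlines(), start=1):
        # In fixed form, a 'C', 'c', '*', or '!' in column 1 marks a comment,
        # so extra columns there do not change the compiled code.
        if line[:1] in ("C", "c", "*", "!"):
            continue
        length = len(line.rstrip())
        if length > limit:
            offenders.append((number, length))
    return offenders


# Usage: one statement exactly 72 columns wide, one that spills to column 73.
sample = "\n".join([" " * 6 + "X" * 66, " " * 6 + "X" * 67])
for number, length in find_long_lines(sample):
    print(f"line {number}: {length} columns")
```

Compilers treating the file as fixed form ignore anything past column 72 by default, so an overlong statement can fail in confusing ways rather than with a clear length error.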

langou · 2024-11-30T10:13:18Z

SRC/la_xisnan.mod

I do not think we want the .mod file

Is this something that could potentially be added to a gitignore file? I notice the file is created after I run 'make', so it seems to be a product of some compilation process. I'm not sure if it's because of some machine tests that are failing (things like testing NaNs and Max behavior).

"Is this something that could potentially be added to a gitignore file." Good idea. Feel welcome to submit a separate pull request with this feature.

langou · 2024-11-30T10:13:24Z

SRC/la_constants.mod

I do not think we want the .mod file

angsch · 2024-11-30T15:03:30Z

SRC/VARIANTS/larft/LL-LVL2/clarft.f

+*> \author Univ. of Colorado Denver
+*> \author NAG Ltd.
+*
+*> \ingroup larft


If the old version shall live in VARIANTS, should the group be updated for the documentation? The other files in this folder carry "variant" in the name. Also, could you please add the variants to the build system? SRC/VARIANTS/Makefile and SRC/VARIANTS/README.

Thank you for the build system catch! It should be in there as of the most recent commit I've made. I'm also hoping that I ironed out all the line-length-related issues (mentioned above).

jprhyne added 16 commits May 29, 2024 17:28

a

8a338cf

Merge branch 'master' of github.com:jprhyne/lapack

7fdd346

Merge branch 'master' of github.com:jprhyne/lapack

d1f787c

Merge branch 'master' of github.com:jprhyne/lapack

4490848

DO NOT MERGE: demonstrating changes work

2122708

CAN MERGE: Implemented my version of xlarft with comments added, and …

5495628

…moved the previous version into VARIANTS

Merge branch 'Reference-LAPACK:master' into master

13aab4a

Merge branch 'Reference-LAPACK:master' into larft

298804e

Merge pull request #1 from jprhyne/larft

b966220

Larft

updating parameter definition in the single complex version

1ba075c

Merge branch 'Reference-LAPACK:master' into master

46e8388

Merge branch 'Reference-LAPACK:master' into master

828db43

Merge branch 'Reference-LAPACK:master' into larft

3065ee8

Merge pull request #2 from jprhyne/larft

60c66af

Merge branch 'Reference-LAPACK:master' into larft

updating documentation to be more descriptive

2534b59

Merge branch 'master' of github.com:jprhyne/lapack

dadd80e

langou reviewed Nov 30, 2024

jprhyne added 3 commits November 30, 2024 08:49

Removed mod files and extranous file changes (hopefully)

354a16f

removed extranous changes (hopefully x2)

273ab49

removed all extranous changes

d4741c8

angsch reviewed Nov 30, 2024

jprhyne added 2 commits November 30, 2024 11:05

lowered line length to hopefully fix build failures in the CI

db48820

Updated variants information as well as fixed trailing line in zlarft

e9b05ef

langou approved these changes Nov 30, 2024

jprhyne mentioned this pull request Dec 2, 2024

updating gitignore to ignore the mod files when we compile #1082

Merged

2 tasks

langou merged commit 6ec7f2b into Reference-LAPACK:master Dec 3, 2024
12 checks passed
