Refactor custom gemm heuristics #56

gshtras · 2024-06-19T18:26:18Z

Moving the custom skinny GEMM heuristic before the hipBLAS or rocBLAS solutions.
Disabling the now-obsolete LLMM1 path, which is fully covered by the new kernel.
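
The reordering described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function name `pick_gemm_path`, the shape limits, and the path labels are all hypothetical, chosen only to show a custom skinny-GEMM check running before the library fallback.

```python
from typing import Tuple

# Hypothetical limits, for illustration only; the real conditions live in
# tuned_gemm.py and are not reproduced here.
MAX_SKINNY_ROWS = 4
SKINNY_K_ALIGNMENT = 8


def pick_gemm_path(inp_shape: Tuple[int, int],
                   weight_shape: Tuple[int, int],
                   has_tuned_solution: bool) -> str:
    """Return a label for the GEMM path a dispatcher would take.

    inp_shape is (n, k): n input rows of length k.
    weight_shape is (m, k): m output features.
    """
    n, k = inp_shape
    _, k_w = weight_shape
    if k != k_w:
        raise ValueError(f"inner dimensions differ: {k} vs {k_w}")

    # 1. The custom skinny kernel is considered first.
    if n <= MAX_SKINNY_ROWS and k % SKINNY_K_ALIGNMENT == 0:
        return "custom_skinny"
    # 2. Otherwise use a tuned hipBLAS/rocBLAS solution if one is recorded.
    if has_tuned_solution:
        return "tuned_blas"
    # 3. Final fallback: the framework's default linear op.
    return "default"
```

With these assumed limits, a single-row input takes the custom path even when a tuned BLAS solution exists, which is the ordering change the description refers to.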

mawong-amd

Mostly looks good, assuming performance testing shows no regressions.

vllm/model_executor/layers/tuned_gemm.py

mawong-amd · 2024-06-19T18:30:20Z

+                            weights.shape[0],
+                            dtype=inp_view.dtype,
+                            device='cuda')
+        _custom_C.wvSpltK(weights, inp_view, out, n, self.cu_count)


This doesn't need to change right now, but we probably want to refactor it eventually so that the MP core count is handled at the C++ level. IMO it's not good decomposition to have it here.
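
One way to picture the decomposition suggested here, sketched in Python rather than C++ for brevity: the kernel wrapper looks up the compute-unit count itself and caches it, so call sites no longer pass `self.cu_count`. Everything in this sketch (`make_skinny_gemm`, `query_cu_count`, the stand-in `kernel`) is hypothetical and is not part of `_custom_C`.

```python
from functools import lru_cache
from typing import Callable, List


def make_skinny_gemm(query_cu_count: Callable[[], int],
                     kernel: Callable[[List[float], int], List[float]]):
    """Wrap `kernel` so callers never supply the compute-unit count.

    query_cu_count: hypothetical device query, called at most once.
    kernel: hypothetical stand-in for the real kernel; it takes the data
            and the compute-unit count.
    """

    @lru_cache(maxsize=1)
    def cu_count() -> int:
        # Queried lazily on first use, then served from the cache.
        return query_cu_count()

    def skinny_gemm(data: List[float]) -> List[float]:
        # The count is an implementation detail of the wrapper.
        return kernel(data, cu_count())

    return skinny_gemm
```

The same idea at the C++ level would amount to querying the device properties inside the extension and caching the result there, which keeps the Python layer free of hardware details.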

vllm/model_executor/layers/linear.py

mawong-amd

Looks good!

Moving custom skinny gemm heuristic before hipblas or rocblas solutions. Disabling the now obsolete LLMM1 path

2ba7ab3

gshtras requested a review from mawong-amd June 19, 2024 18:26

Simplified the decision logic

f75f670

gshtras requested a review from dllehr-amd June 19, 2024 18:32

mawong-amd reviewed Jun 19, 2024

Added back one case when LLMM1 can be used. Defaulting to adding bias separately

aba49c6

gshtras requested a review from mawong-amd June 20, 2024 17:09

gshtras added 4 commits June 20, 2024 17:30

Moved bias addition inside tgemm

fb920fa

yapf

f7936ca

Calling the right function

ed5d93c

ruff

5ff9c6c

mawong-amd reviewed Jun 20, 2024

vllm/model_executor/layers/linear.py (resolved)

Update tuned_gemm.py

486f78d

mawong-amd approved these changes Jun 20, 2024

gshtras merged commit 4460294 into main Jun 20, 2024
13 checks passed

gshtras deleted the refactor_custom_gemm_heuristic branch June 20, 2024 22:11
