Custom PA Partition size 256 to improve performance #238

Merged: 6 commits, Oct 22, 2024

sanyalington

Add an option to control the partition size in the Custom PA launcher.
Change the default partition size to 256 to improve performance across products and use cases.
Change the template switch-case in the CPA reduction kernel to support a 128K context length with a partition size of 256.
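
The link between partition size and the reduction kernel's switch-case can be seen from the partition arithmetic. The following is a minimal sketch, not code from this PR: the helper names are hypothetical, and it assumes the reduction kernel walks the partitions in chunks of one 64-lane wavefront, with the chunk count being the value the switch-case dispatches on.

```cpp
#include <cassert>

// Hypothetical helper: integer ceiling division.
constexpr int divide_round_up(int a, int b) { return (a + b - 1) / b; }

// Number of partitions produced for a sequence of `context_len` tokens.
constexpr int num_partitions(int context_len, int partition_size) {
  return divide_round_up(context_len, partition_size);
}

// Assumption (not confirmed by the PR text): the reduction kernel handles
// one wavefront's worth of partitions per loop iteration.
constexpr int kAssumedWarpSize = 64;

// Loop count the reduction kernel would need under that assumption.
constexpr int reduction_loops(int context_len, int partition_size) {
  return divide_round_up(num_partitions(context_len, partition_size),
                         kAssumedWarpSize);
}

// A 128K context with partition size 256 yields 512 partitions.
static_assert(num_partitions(128 * 1024, 256) == 512, "512 partitions");
// Halving the partition size doubles the partition count.
static_assert(num_partitions(128 * 1024, 256) ==
                  2 * num_partitions(128 * 1024, 512),
              "partition count scales inversely with partition size");
// Under the 64-lane assumption, that is 8 reduction loop iterations.
static_assert(reduction_loops(128 * 1024, 256) == 8, "8 loops");
```

Under these assumptions, a smaller partition size means more partitions for the same context length, so a compile-time dispatch on the loop count needs enough cases to cover the longest supported context.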

@@ -316,12 +316,14 @@ __global__ __launch_bounds__(NUM_THREADS) void paged_attention_ll4mi_QKV_kernel(
int vphysical_blocks[VBLOCKS];

const int warp_start_block_idx = warp_start_token_idx / BLOCK_SIZE;
if constexpr (GQA_RATIO < 12) {
Collaborator

Will not block this merge because of this, but please consider turning this magic 12 into a named constant in a subsequent

Author

Noted, thanks!
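
One way the suggested follow-up might look is sketched below. The constant name `kGqaRatioThreshold` and the helper `below_gqa_threshold` are hypothetical and exist only for illustration; the PR itself keeps the literal `12`, and the sketch makes no claim about what the kernel does on either side of the branch.

```cpp
#include <cassert>

// Hypothetical name for the literal 12 in `if constexpr (GQA_RATIO < 12)`.
constexpr int kGqaRatioThreshold = 12;

// Illustrative stand-in for the compile-time branch in the kernel.
// It only reports which side of the threshold a given ratio falls on.
template <int GQA_RATIO>
constexpr bool below_gqa_threshold() {
  if constexpr (GQA_RATIO < kGqaRatioThreshold) {
    return true;   // path taken for ratios below the threshold
  } else {
    return false;  // path taken for ratios at or above the threshold
  }
}

static_assert(below_gqa_threshold<8>(), "8 is below the threshold");
static_assert(!below_gqa_threshold<12>(), "12 is not below the threshold");
```

A named constant keeps the comparison and any related documentation in one place, so a later tuning change touches a single definition.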

@gshtras gshtras merged commit 1eefd1e into main Oct 22, 2024
16 of 17 checks passed
@gshtras gshtras deleted the shsanyal_devpa_partition_size branch October 22, 2024 14:43
2 participants