Can LLaMA and BLOOM inference acceleration be supported? #502
No description provided.
Comments
It is not supported currently.
It will be supported in May, and it is expected to be deployable on a V100-32G.
@hexisyztem Hi, can flash attention be used on V100?
As you can see in https://github.com/HazyResearch/flash-attention, flash attention doesn't support V100.
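For context, FlashAttention's kernels target Turing/Ampere-class tensor cores (compute capability 7.5 or higher), while V100 is sm_70. A minimal sketch of a runtime guard that falls back to standard attention on unsupported GPUs (assuming PyTorch >= 2.0; `flash_attention_available` and `attention` are hypothetical helper names, not part of lightseq):

```python
import torch

def flash_attention_available() -> bool:
    """Return True if the current GPU meets FlashAttention's hardware
    requirement (compute capability >= 7.5). V100 is sm_70, so it fails
    this check."""
    if not torch.cuda.is_available():
        return False
    major, minor = torch.cuda.get_device_capability()
    return (major, minor) >= (7, 5)

def attention(q, k, v):
    """Use a fused attention path when the hardware allows it,
    otherwise fall back to explicit softmax(QK^T / sqrt(d)) V."""
    if flash_attention_available():
        # PyTorch >= 2.0 can dispatch this to a FlashAttention kernel
        # when the inputs and hardware are eligible.
        return torch.nn.functional.scaled_dot_product_attention(q, k, v)
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5
    return torch.softmax(scores, dim=-1) @ v
```

Guarding on compute capability like this lets the same code path run on V100 (with the naive fallback) and on A100/T4-class GPUs (with the fused kernel).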