Update sdpa function with enable_gqa=True #191

Open
jainapurva wants to merge 2 commits into main from gqa_support

@jainapurva

For the Llama model, set enable_gqa=True in the sdpa function call to use PyTorch's built-in grouped query attention (GQA) support.
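As a rough illustration of what the flag stands in for, the sketch below expands the key/value heads by hand with `repeat_interleave` before calling `scaled_dot_product_attention`. The tensor sizes and variable names are assumptions chosen for illustration; they are not taken from this PR's diff.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: 8 query heads share 2 key/value heads (group size 4).
batch, n_q_heads, n_kv_heads, seq_len, head_dim = 2, 8, 2, 16, 32

torch.manual_seed(0)
q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Manual grouped query attention: repeat each key/value head so that
# consecutive query heads in a group attend to the same key/value head.
group_size = n_q_heads // n_kv_heads
k_expanded = k.repeat_interleave(group_size, dim=1)
v_expanded = v.repeat_interleave(group_size, dim=1)

out_manual = F.scaled_dot_product_attention(q, k_expanded, v_expanded, is_causal=True)
print(out_manual.shape)  # torch.Size([2, 8, 16, 32])
```

On a PyTorch release that accepts the keyword, passing the unexpanded `k` and `v` with `enable_gqa=True` is intended to give the same result without the explicit copies.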

@facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Jul 13, 2024
@jainapurva requested a review from drisspg July 13, 2024 03:56
yanboliang (Contributor) commented Jul 26, 2024

I think we should wait a bit before merging this, since many users are still on older versions of PyTorch that don't support enable_gqa. That said, I'm interested in how much of a perf gain we get from enabling the built-in GQA. Do you have numbers on A100?

3 participants