Replies: 1 comment
It is in the paged layout with page size = 1.
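
For anyone unfamiliar with the term, here is a minimal sketch of what a paged KV layout with page size 1 means in general. It is an illustration under assumptions, not SGLang's actual code: the `PagedKVPool` class and its methods are hypothetical names made up for this example. With page size 1, every page holds exactly one token, so a sequence's page table has one entry per token.

```python
# Hypothetical illustration only; these names are NOT SGLang's real data structures.
class PagedKVPool:
    """A toy KV cache pool split into fixed-size pages."""

    def __init__(self, num_pages, page_size):
        self.page_size = page_size
        self.free_pages = list(range(num_pages))
        # storage[page][offset] holds the cached entry for one token.
        self.storage = [[None] * page_size for _ in range(num_pages)]

    def allocate(self, num_tokens):
        """Reserve enough pages for num_tokens and return the page table."""
        num_needed = -(-num_tokens // self.page_size)  # ceiling division
        if num_needed > len(self.free_pages):
            raise MemoryError("not enough free pages")
        pages = self.free_pages[:num_needed]
        del self.free_pages[:num_needed]
        return pages

    def write(self, page_table, pos, kv):
        """Store the entry for token position pos of a sequence."""
        page = page_table[pos // self.page_size]
        self.storage[page][pos % self.page_size] = kv

    def read(self, page_table, pos):
        """Look up the entry for token position pos of a sequence."""
        page = page_table[pos // self.page_size]
        return self.storage[page][pos % self.page_size]


# With page_size=1 the page table is simply one slot index per token.
pool = PagedKVPool(num_pages=8, page_size=1)
table_a = pool.allocate(3)
table_b = pool.allocate(2)
for pos in range(3):
    pool.write(table_a, pos, ("kv_a", pos))
```

In this toy version an attention kernel would gather keys and values through the page table, so the lookup stays the same whether a page holds one token or many.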
On this topic, I'm wondering whether SGLang supports a paged attention kernel. The Triton backend appears to use flash attention. Is there a reason SGLang chooses flash attention over paged attention?