Is expert parallelism stable? And will it be supported for DeepSeek-R1? #3241
dmakhervaks started this conversation in General
Please see title.
I'm running on 2x8 H100s. Batch inference is very slow, so I'm trying to find a way to speed it up. Any ideas?