Enable Llama 3.1 405B in FP8 (#124) #1745

jaygala223 · 2025-02-05T15:35:07Z

add changes for fix
add keep_module_on_host, modify quant json
remove buffer check
add llama 405 checks
remove hardcoded path, reuse module on host check
fix: undefined variable
remove unused import

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

regisss · 2025-02-07T10:45:45Z

examples/text-generation/utils.py

@@ -429,8 +429,13 @@ def setup_distributed_model(args, model_dtype, model_kwargs, logger):

    logger.info("DeepSpeed is enabled.")
    deepspeed.init_distributed(dist_backend="hccl")
-    config = AutoConfig.from_pretrained(args.model_name_or_path, torch_dtype=model_dtype, **model_kwargs)


We cannot remove torch_dtype=model_dtype here and then use config.torch_dtype afterwards, because we would no longer be able to specify whether the model should run in fp32 or in bf16.
Why do you need to do that?
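
To illustrate the concern, here is a minimal sketch, assuming a loader whose keyword arguments override the values stored with the checkpoint. The names SketchConfig and load_config are hypothetical stand-ins written for this example, not the transformers API, and the dtypes are plain strings rather than torch dtypes.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a model config; this is NOT the transformers API.
@dataclass
class SketchConfig:
    torch_dtype: str  # dtype stored with the checkpoint, or the caller's override


def load_config(checkpoint_dtype, **overrides):
    """Mimic a loader whose keyword overrides replace the stored values."""
    values = {"torch_dtype": checkpoint_dtype}
    values.update(overrides)
    return SketchConfig(**values)


# With the override, the dtype requested by the caller wins.
with_override = load_config("float32", torch_dtype="bfloat16")

# Without it, config.torch_dtype only reflects what the checkpoint stored,
# so the caller has no way to choose between fp32 and bf16.
without_override = load_config("float32")

print(with_override.torch_dtype, without_override.torch_dtype)
```

In this sketch, dropping the keyword argument means any later read of config.torch_dtype returns the checkpoint's value rather than the one requested on the command line.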

jaygala223 requested a review from regisss as a code owner February 5, 2025 15:35

libinta added run-test Run CI for PRs from external contributors and removed run-test Run CI for PRs from external contributors labels Feb 5, 2025

jiminha approved these changes Feb 6, 2025

jiminha added the run-test Run CI for PRs from external contributors label Feb 7, 2025

regisss reviewed Feb 7, 2025
