The same bug that was present in AWS Neuron SDK 2.19 and fixed in 2.19.1 (#91) is back in AWS Neuron SDK 2.20.
With AWS Neuron SDK 2.19, when exporting a model and saving the compiled artifacts, it is impossible to reload them afterwards if the Python path is different.
This makes shared serialization and caching impossible, since you cannot control the deployment environment (EC2 with DLAMI, SageMaker, or ad-hoc end-user endpoints will all have different environments).
FileNotFoundError: Could not find a matching NEFF for your HLO in this directory. Ensure that the model you are trying to load is the same type and has the same parameters as the one you saved or call "save" on this model to reserialize it.
Steps to reproduce
download test_tnx_llama_export.py
export the model in a venv
$ python test_tnx_llama_export.py export meta-llama/Llama-3.1-8B-Instruct --save_dir ./llama-bar
When the saved artifacts are then reloaded under a different Python path, you should get the FileNotFoundError quoted above.
Now if you compare the NEFF files in the two save directories, you will see that one of them is different.
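
One way to make that comparison is to hash every NEFF file in both save directories and list the ones that do not match. The following is only a minimal sketch, not part of the original reproduction script: it assumes the compiled artifacts carry a .neff extension, and the helper names (neff_digests, diff_neffs) and the ./llama-foo directory are hypothetical.

```python
import hashlib
from pathlib import Path


def neff_digests(save_dir):
    """Map each .neff file (path relative to save_dir) to its SHA-256 digest.

    Assumption: compiled artifacts use the ".neff" extension.
    """
    root = Path(save_dir)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.neff"))
    }


def diff_neffs(dir_a, dir_b):
    """Return the relative paths whose content differs between the two
    directories, or that exist in only one of them."""
    a, b = neff_digests(dir_a), neff_digests(dir_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


# Hypothetical usage: "./llama-foo" stands in for the first export directory.
# print(diff_neffs("./llama-foo", "./llama-bar"))
```

If the NEFF file names themselves encode a hash, a changed artifact will show up as two entries (one per directory) rather than as a single differing file.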