Information

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction (minimal, reproducible, runnable)
$ optimum-cli export onnx \
--framework pt \
--model microsoft/deberta-v3-base \
--task text-classification \
output-dir
Some weights of DebertaV2ForSequenceClassification were not initialized from the model checkpoint at microsoft/deberta-v3-base and are newly initialized: ['classifier.bias', 'classifier.weight', 'pooler.dense.bias', 'pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/convert_slow_tokenizer.py:558: UserWarning: The sentencepiece tokenizer that you are converting to a fast tokenizer uses the byte fallback option which is not implemented in the fast tokenizers. In practice this means that the fast version of the tokenizer can produce unknown tokens whereas the sentencepiece version would have converted these unknown tokens into a sequence of byte tokens matching the original piece of text.
warnings.warn(
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:547: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
torch.tensor(mid - 1).type_as(relative_pos),
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:551: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
torch.ceil(torch.log(abs_pos / mid) / torch.log(torch.tensor((max_position - 1) / mid)) * (mid - 1)) + mid
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:710: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
scale = torch.sqrt(torch.tensor(query_layer.size(-1), dtype=torch.float) * scale_factor)
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:710: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
scale = torch.sqrt(torch.tensor(query_layer.size(-1), dtype=torch.float) * scale_factor)
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:785: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
scale = torch.sqrt(torch.tensor(pos_key_layer.size(-1), dtype=torch.float) * scale_factor)
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:785: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
scale = torch.sqrt(torch.tensor(pos_key_layer.size(-1), dtype=torch.float) * scale_factor)
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:797: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
scale = torch.sqrt(torch.tensor(pos_query_layer.size(-1), dtype=torch.float) * scale_factor)
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:797: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
scale = torch.sqrt(torch.tensor(pos_query_layer.size(-1), dtype=torch.float) * scale_factor)
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:798: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if key_layer.size(-2) != query_layer.size(-2):
/home/marcovalenzuelaescarcega/.virtualenvs/onnx/lib/python3.11/site-packages/transformers/models/deberta_v2/modeling_deberta_v2.py:105: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
output = input.masked_fill(rmask, torch.tensor(torch.finfo(input.dtype).min))
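For context, here is a minimal standalone sketch (mine, not from optimum or transformers) of why these TracerWarnings can matter: a value created with torch.tensor(...) inside a traced function is frozen into the graph as a constant, so the trace may not generalize to inputs of other shapes.

import torch

def scale_like_deberta(x):
    # At trace time, x.size(-1) is a plain Python int, so torch.tensor(...)
    # bakes the value captured during tracing into the graph as a constant.
    scale = torch.sqrt(torch.tensor(x.size(-1), dtype=torch.float))
    return x / scale

# Tracing emits the same kind of TracerWarning as above and freezes sqrt(4).
traced = torch.jit.trace(scale_like_deberta, torch.randn(2, 4))

x = torch.randn(2, 8)
print(scale_like_deberta(x)[0, 0])  # eager: divides by sqrt(8)
print(traced(x)[0, 0])              # traced: still divides by sqrt(4)

Whether a given warning is harmless depends on whether the frozen value can actually change between inputs (a fixed head dimension is fine; a sequence-length-dependent value is not).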
Expected behavior
I am trying to export my own fine-tuned DeBERTaV3 model and I get the TracerWarnings shown above. Here I am using microsoft/deberta-v3-base as an example, so please ignore the warning about some weights not being initialized. The real problem is the TracerWarnings, because an ONNX model is generated but it is incorrect: it seems to always predict the same label.
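To make the comparison concrete, here is a minimal sketch that checks the exported model against the PyTorch one (it assumes the export above wrote output-dir/model.onnx, which I believe is the default file name):

import numpy as np
import onnxruntime as ort
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "microsoft/deberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
pt_model = AutoModelForSequenceClassification.from_pretrained(model_id).eval()

session = ort.InferenceSession("output-dir/model.onnx")
onnx_input_names = {i.name for i in session.get_inputs()}

for text in ["a short example", "a much longer example to vary the sequence length"]:
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        pt_logits = pt_model(**enc).logits.numpy()
    # Feed only the inputs the ONNX graph actually declares.
    ort_inputs = {k: v.numpy() for k, v in enc.items() if k in onnx_input_names}
    onnx_logits = session.run(None, ort_inputs)[0]
    print(repr(text), "max abs diff:", np.abs(pt_logits - onnx_logits).max())

With my fine-tuned checkpoint in place of model_id, a difference that grows with sequence length would be consistent with the trace not generalizing.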
I also see similar TracerWarnings with microsoft/deberta-v2-xlarge, and some (fewer) TracerWarnings with microsoft/deberta-base, but I haven't checked whether those converted models are incorrect too.
If optimum-cli export onnx doesn't support DeBERTaV3, I would appreciate a pointer on how to do the conversion from code, or any other possible solution.
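For reference, the programmatic route would presumably be main_export (a sketch; I am assuming main_export in optimum.exporters.onnx is the function the CLI wraps, so it likely hits the same tracing issue):

from optimum.exporters.onnx import main_export

# Programmatic equivalent of the optimum-cli invocation above (assumption:
# main_export is the entry point the CLI wraps in optimum 1.23).
main_export(
    model_name_or_path="microsoft/deberta-v3-base",
    output="output-dir",
    task="text-classification",
    framework="pt",
)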
Thanks!
System Info
$ pip freeze | grep optimum
optimum==1.23.1

$ python -V
Python 3.11.2

$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 12 (bookworm)
Release:        12
Codename:       bookworm
Who can help?
@michaelbenayoun