Error when using the official DPO dataset template #2968

WjMessi1 · 2025-01-23T04:30:11Z

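I built a dataset at /data/Telechat/dpo_refusal_dataset_official.jsonl and included it in the fine-tuning run below. If I leave this dataset out and use only hjh0119/shareAI-Llama3-DPO-zh-en-emoji, training runs normally.

Dataset contents: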

{"messages": [{"role": "system", "content": "你是个有用无害的助手"}, {"role": "user", "content": "告诉我明天的天气"}, {"role": "assistant", "content": "明天天气晴朗"}], "rejected_response": "我不知道"}
{"messages": [{"role": "system", "content": "你是个有用无害的数学计算器"}, {"role": "user", "content": "1+1等于几"}, {"role": "assistant", "content": "等于2"}, {"role": "user", "content": "再加1呢"}, {"role": "assistant", "content": "等于3"}], "rejected_response": "我不知道"}

I ran the DPO fine-tuning command, based on the reference script:

NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 swift rlhf --model_type telechat2-115b --rlhf_type dpo --model_id_or_path /data/TeleChat2-7B --dataset hjh0119/shareAI-Llama3-DPO-zh-en-emoji#100 /data/Telechat/dpo_refusal_dataset_official.jsonl#70 --num_train_epochs 10 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --gradient_accumulation_steps 8 --eval_steps 10 --save_steps 10 --save_total_limit 5 --logging_steps 5 --max_length 2048 --output_dir output --warmup_ratio 0.05 --dataloader_num_workers 4

The error is as follows:

[INFO:swift] The RLHFArguments will be saved in: /data/Telechat/TeleChat2/TeleChat2-7B/output/telechat2-115b/v9-20250123-121710/sft_args.json
[INFO:swift] The DPOConfig will be saved in: /data/Telechat/TeleChat2/TeleChat2-7B/output/telechat2-115b/v9-20250123-121710/training_args.json
[INFO:swift] The logging file will be saved in: /data/Telechat/TeleChat2/TeleChat2-7B/output/telechat2-115b/v9-20250123-121710/logging.jsonl

Train:   0%|          | 0/100 [00:00<?, ?it/s][rank1]: Traceback (most recent call last):
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/cli/rlhf.py", line 5, in <module>
[rank1]:     rlhf_main()
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/utils/run_utils.py", line 32, in x_main
[rank1]:     result = llm_x(args, **kwargs)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/rlhf.py", line 47, in llm_rlhf
[rank1]:     return trainer_train(
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/sft.py", line 496, in trainer_train
[rank1]:     trainer.train(training_args.resume_from_checkpoint)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/trainers/mixin.py", line 493, in train
[rank1]:     res = super().train(resume_from_checkpoint, *args, **kwargs)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 2164, in train
[rank1]:     return inner_training_loop(
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 2472, in _inner_training_loop
[rank1]:     batch_samples, num_items_in_batch = self.get_batch_samples(epoch_iterator, num_batches)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 5131, in get_batch_samples
[rank1]:     batch_samples += [next(epoch_iterator)]
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/accelerate/data_loader.py", line 552, in __iter__
[rank1]:     current_batch = next(dataloader_iter)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
[rank1]:     data = self._next_data()
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
[rank1]:     return self._process_data(data)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
[rank1]:     data.reraise()
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/_utils.py", line 706, in reraise
[rank1]:     raise exception
[rank1]: RuntimeError: Caught RuntimeError in DataLoader worker process 0.
[rank1]: Original Traceback (most recent call last):
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
[rank1]:     data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
[rank1]:     return self.collate_fn(data)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 4157, in data_collator
[rank1]:     return _data_collator(new_batch or batch, padding_to)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 1051, in data_collator
[rank1]:     res[key] = [torch.tensor(b[key]) for b in batch]
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 1051, in <listcomp>
[rank1]:     res[key] = [torch.tensor(b[key]) for b in batch]
[rank1]: RuntimeError: Could not infer dtype of NoneType

[rank0]: Traceback (most recent call last):
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/cli/rlhf.py", line 5, in <module>
[rank0]:     rlhf_main()
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/utils/run_utils.py", line 32, in x_main
[rank0]:     result = llm_x(args, **kwargs)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/rlhf.py", line 47, in llm_rlhf
[rank0]:     return trainer_train(
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/sft.py", line 496, in trainer_train
[rank0]:     trainer.train(training_args.resume_from_checkpoint)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/trainers/mixin.py", line 493, in train
[rank0]:     res = super().train(resume_from_checkpoint, *args, **kwargs)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 2164, in train
[rank0]:     return inner_training_loop(
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 2472, in _inner_training_loop
[rank0]:     batch_samples, num_items_in_batch = self.get_batch_samples(epoch_iterator, num_batches)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 5131, in get_batch_samples
[rank0]:     batch_samples += [next(epoch_iterator)]
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/accelerate/data_loader.py", line 563, in __iter__
[rank0]:     next_batch = next(dataloader_iter)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
[rank0]:     data = self._next_data()
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1324, in _next_data
[rank0]:     return self._process_data(data)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
[rank0]:     data.reraise()
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/_utils.py", line 706, in reraise
[rank0]:     raise exception
[rank0]: RuntimeError: Caught RuntimeError in DataLoader worker process 1.
[rank0]: Original Traceback (most recent call last):
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
[rank0]:     data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
[rank0]:     return self.collate_fn(data)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 4157, in data_collator
[rank0]:     return _data_collator(new_batch or batch, padding_to)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 1051, in data_collator
[rank0]:     res[key] = [torch.tensor(b[key]) for b in batch]
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 1051, in <listcomp>
[rank0]:     res[key] = [torch.tensor(b[key]) for b in batch]
[rank0]: RuntimeError: Could not infer dtype of NoneType

Exception in thread Exception in thread Thread-3:
Traceback (most recent call last):
  File "/root/anaconda3/envs/telechat2/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/root/anaconda3/envs/telechat2/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/_utils/pin_memory.py", line 55, in _pin_memory_loop
    
Train:   0%|          | 0/100 [00:00<?, ?it/s]
E0123 12:17:47.529269 140125589083328 torch/distributed/elastic/multiprocessing/api.py:833] failed (exitcode: 1) local_rank: 0 (pid: 3785853) of binary: /root/anaconda3/envs/telechat2/bin/python
Traceback (most recent call last):
  File "/root/anaconda3/envs/telechat2/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/anaconda3/envs/telechat2/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/run.py", line 905, in <module>
    main()
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper
    return f(*args, **kwargs)
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/run.py", line 901, in main
    run(args)
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/run.py", line 892, in run
    elastic_launch(
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 133, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/cli/rlhf.py FAILED
------------------------------------------------------------
Failures:
[1]:
  time      : 2025-01-23_12:17:47
  host      : ecm-22b5
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 3785854)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-01-23_12:17:47
  host      : ecm-22b5
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 3785853)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
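
The innermost frame in both rank tracebacks is `torch.tensor(b[key])` inside the collator, and "Could not infer dtype of NoneType" is the error `torch.tensor` raises when it is handed `None`, so at least one encoded sample in the batch carried a `None` field. One way to narrow this down is to scan the encoded samples before they are collated. The sketch below is only an illustration: `find_none_fields` is a hypothetical helper, not an ms-swift function, and the key names in the comment are placeholders rather than the collator's actual keys.

```python
def find_none_fields(batch):
    """Return (sample_index, key) pairs whose value is None.

    `batch` is assumed to be a list of dicts, one per encoded sample,
    e.g. [{"input_ids": [...], "labels": [...]}, ...] (illustrative keys).
    """
    return [
        (index, key)
        for index, sample in enumerate(batch)
        for key, value in sample.items()
        if value is None
    ]
```

Any pair it reports identifies a sample and field that would trigger the same `RuntimeError` when passed to `torch.tensor`.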

Your hardware and system info
ms-swift Version: 2.6.1

skdom6 · 2025-01-23T04:56:53Z

Did you solve it? I ran into the same problem.

WjMessi1 · 2025-01-23T05:01:26Z

> Did you solve it? I ran into the same problem.

Not yet. I'm waiting for someone from the ModelScope team to answer.

skdom6 · 2025-01-23T05:16:20Z

Sorry, my problem turned out to be different from yours. I just solved it; my own data was wrong.

Jintao-Huang · 2025-01-23T05:56:16Z

It works fine in my test.

Could you try upgrading ms-swift?

WjMessi1 · 2025-01-23T06:17:13Z

> It works fine in my test.
>
> Could you try upgrading ms-swift?

OK, I'll give it a try.

WjMessi1 · 2025-01-23T06:57:31Z

> It works fine in my test.
>
> Could you try upgrading ms-swift?

Hello, I reinstalled the latest ms-swift (version 3.0.3) and ran the following DPO command:

NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 swift rlhf --model_type telechat2 --rlhf_type dpo --model /data/Telechat/TeleChat2/TeleChat2-7B --dataset /data/Telechat/dpo_refusal_dataset_official.jsonl --num_train_epochs 10 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --gradient_accumulation_steps 8 --eval_steps 10 --save_steps 10 --save_total_limit 5 --logging_steps 5 --max_length 2048 --output_dir output --ddp_find_unused_parameters true --warmup_ratio 0.05 --dataloader_num_workers 4 --deepspeed zero2

A new error now appears:

[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/mixin.py:77: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `DPOTrainer.__init__`. Use `processing_class` instead.
  super().__init__(                                                                                                                                                                                     
Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.           
Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.                                                                                                                   
[INFO:swift] The logging file will be saved in: /data/Telechat/TeleChat2/TeleChat2-7B/output/v5-20250123-145217/logging.jsonl                                                                           
[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/mixin.py:77: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `DPOTrainer.__init__`. Use `processing_class` instead.
  super().__init__(                                                                                                                                                                                     
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore `gradient_checkpointing_kwargs` in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method `_set_gradient_checkpointing` in your model.
Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.                                                                                                                   
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore `gradient_checkpointing_kwargs` in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method `_set_gradient_checkpointing` in your model.
[rank1]: Traceback (most recent call last):                                                                                                                                                             
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/cli/rlhf.py", line 5, in <module>                                                                                      
[rank1]:     rlhf_main()                                                                                                                                                                                
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/train/rlhf.py", line 92, in rlhf_main                                                                              
[rank1]:     return SwiftRLHF(args).main()                                                                                                                                                              
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/base.py", line 46, in main                                                                                         
[rank1]:     result = self.run()                                                                                                                                                                        
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/train/sft.py", line 137, in run                                                                                    
[rank1]:     return self.train(trainer)                                                                                                                                                                 
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/train/sft.py", line 189, in train                                                                                  
[rank1]:     trainer.train(trainer.args.resume_from_checkpoint)                                                                                                                                         
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/mixin.py", line 261, in train                                                                                 
[rank1]:     res = super().train(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/transformers/trainer.py", line 2164, in train                                                                                
[rank1]:     return inner_training_loop(                                                                                                                                                                
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/transformers/trainer.py", line 2524, in _inner_training_loop
[rank1]:     tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/transformers/trainer.py", line 3654, in training_step
[rank1]:     loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)                                                                                                             
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/rlhf_trainer/rlhf_mixin.py", line 155, in compute_loss                                                        
[rank1]:     res = super().compute_loss(model, inputs, return_outputs=return_outputs)                                                                                                                   
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/trl/trainer/dpo_trainer.py", line 1489, in compute_loss                                                                      
[rank1]:     loss, metrics = self.get_batch_loss_metrics(model, inputs, train_eval="train")                                                                                                             
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/trl/trainer/dpo_trainer.py", line 1415, in get_batch_loss_metrics                                                            
[rank1]:     forward_output = self.concatenated_forward(model, batch)                                                                                                                                   
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/rlhf_trainer/rlhf_mixin.py", line 122, in concatenated_forward                                                
[rank1]:     outputs = model(**model_kwargs, use_cache=False)                                                                                                                                           
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 18, in wrapped_fn                                                                             
[rank1]:     ret_val = func(*args, **kwargs)                                                                                                                                                            
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1914, in forward                                                                          
[rank1]:     loss = self.module(*inputs, **kwargs)                                                                                                                                                      
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/peft/peft_model.py", line 1719, in forward                                                                                   
[rank1]:     return self.base_model(                                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 197, in forward                                                                           
[rank1]:     return self.model.forward(*args, **kwargs)                                                                                                                                                 
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 821, in forward                                                                        
[rank1]:     transformer_outputs = self.transformer(                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 721, in forward                                                                        
[rank1]:     outputs = torch.utils.checkpoint.checkpoint(                                                                                                                                               
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/arguments.py", line 49, in _new_checkpoint                                                                    
[rank1]:     return _old_checkpoint(*args, use_reentrant=use_reentrant_, **kwargs)                                                                                                                      
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/_compile.py", line 32, in inner                                                                                        
[rank1]:     return disable_fn(*args, **kwargs)                                                                                                                                                         
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 632, in _fn                                                                               
[rank1]:     return fn(*args, **kwargs)                                                                                                                                                                 
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 489, in checkpoint                                                                          
[rank1]:     return CheckpointFunction.apply(function, preserve, *args)                                                                                                                                 
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/autograd/function.py", line 575, in apply                                                                              
[rank1]:     return super().apply(*args, **kwargs)  # type: ignore[misc]                                                                                                                                
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 264, in forward                                                                             
[rank1]:     outputs = run_function(*args)   
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 717, in custom_forward                                                                 
[rank1]:     return module(*inputs, use_cache=use_cache, output_attentions=output_attentions)                                                                                                           
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 551, in forward                                                                        
[rank1]:     attn_outputs = self.self_attention(                                                                                                                                                        
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 493, in forward                                                                        
[rank1]:     context_layer = torch.bmm(attention_probs_reshaped, value_layer.transpose(0, 1))                                                                                                           
[rank1]: RuntimeError: expected scalar type BFloat16 but found Float                                                                                                                                    
Train:   0%|                                                                                                                                                                     | 0/10 [00:00<?, ?it/s]

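This is probably a problem with the TeleChat2 7B model. I think I modified its parameters earlier, so I will try the official original version from ModelScope. Update: after testing, the original version has the same problem. Could you please take a look and see whether it can be fixed?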

WjMessi1 · 2025-01-23T07:06:01Z

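Sorry, my case is probably different from yours. I just solved it; I had made a mistake in my own data.

Could I take a look at your run arguments?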

lonngxiang · 2025-01-24T01:34:51Z

How much GPU memory is needed to run this?

Jintao-Huang · 2025-01-24T01:46:10Z

It works fine in my tests.
Could you try upgrading ms-swift?

Try --dtype float16 or float32.

xiezhipeng-git · 2025-01-28T12:15:58Z

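@Jintao-Huang
This is with the latest code on the GitHub main branch. When a JSON file is used as the DPO data and each record has only the three keys question, chosen, and rejected, ms-swift runs the row through this function in core: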

def preprocess(self, row: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        response = row.pop('response', None)
        if response is not None:
            if isinstance(response, (list, tuple)):
                # sometimes response is a list, pick one randomly
                response = self.random_state.choice(response)
        history = row.pop('history', None) or []
        query = row.pop('query', None)
        system = row.pop('system', None)
        if isinstance(history, str):  # e.g. "[['query1', 'response1']]"
            history = ast.literal_eval(history)
        history.append([query, response])

        row.update({'messages': history_to_messages(history, system)})
        return row

Going through this function, the row is treated as having no history, and row['messages'] = [] is added.
Then, also in core,

for r in row:
    self._check_messages(r)
    self._check_rejected_response(r)
    self._cast_images(r)

inside _check_messages:

def _check_messages(row: Dict[str, Any]) -> None:
        if 'messages' not in row:
            return
        messages = row['messages']
        assert len(messages) > 0, f'messages: {messages}'
        # fix swift/SlimOrca
        for message in messages:
            keys = set(message.keys()) - {'role', 'content'}
            for key in keys:
                message.pop(key)

        if messages[0]['role'] == 'system':
            messages = messages[1:]
        if messages and messages[0]['role'] == 'assistant':
            messages = [{'role': 'user', 'content': ''}] + messages  # pretrain
        for user_message, assistant_message in zip(messages[::2], messages[1::2]):
            if (user_message['role'] not in {'user', 'tool'} or 'content' not in user_message
                    or user_message['content'] is None):
                raise ValueError(f'user_message: {user_message}')
            if (assistant_message['role'] not in {'assistant'} or 'content' not in assistant_message
                    or assistant_message['content'] in {'', None}):
                raise ValueError(f'assistant_message: {assistant_message}')

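Because the messages key has already been added (as an empty list), the code fails immediately at

        assert len(messages) > 0, f'messages: {messages}'

This should not be a dataset problem, should it? I made the dataset myself, but the logic here is self-contradictory.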
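
The failure path described above can be reproduced without ms-swift. The sketch below is a simplified stand-in, not the library's code: `history_to_messages_sketch` and `preprocess_sketch` are hypothetical helpers, and the assumption that `None` queries and responses are skipped when messages are built is inferred from the empty list reported here.

```python
from typing import Any, Dict, List, Optional


def history_to_messages_sketch(history: List[List[Optional[str]]],
                               system: Optional[str] = None) -> List[Dict[str, str]]:
    """Hypothetical stand-in: build chat messages, skipping None entries (assumption)."""
    messages = []
    if system is not None:
        messages.append({'role': 'system', 'content': system})
    for query, response in history:
        if query is not None:
            messages.append({'role': 'user', 'content': query})
        if response is not None:
            messages.append({'role': 'assistant', 'content': response})
    return messages


def preprocess_sketch(row: Dict[str, Any]) -> Dict[str, Any]:
    """Reads the same keys as the quoted preprocess(): response, history, query, system."""
    row = dict(row)  # work on a copy so the caller's dict is untouched
    response = row.pop('response', None)
    history = list(row.pop('history', None) or [])
    query = row.pop('query', None)
    system = row.pop('system', None)
    history.append([query, response])
    row['messages'] = history_to_messages_sketch(history, system)
    return row


# Keys the quoted preprocess() never reads: query and response are both None.
bad = preprocess_sketch({'question': 'q', 'chosen': 'good', 'rejected': 'bad'})
# Keys it does read: one user turn and one assistant turn, with no system prompt.
ok = preprocess_sketch({'query': 'q', 'response': 'good', 'rejected_response': 'bad'})

print(bad['messages'])  # [] -> would trip `assert len(messages) > 0`
print(ok['messages'])
```

Under these assumptions the empty list comes from the unrecognized key names rather than from a missing system prompt.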

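There is another point worth discussing. In DPO data, the chosen and rejected answers may both be wrong; the chosen one is only the relatively better of the two. In the ms-swift code, however, the chosen answer is named as if it were an ordinary correct answer. (I have not yet read the code that handles it later.) I wonder whether the naming here is appropriate. (If it is named this way because later processing needs a unified format, it does no harm.) This is just a heads-up.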

    def _check_rejected_response(row: Dict[str, Any]) -> None:
        if 'rejected_messages' in row:
            chosen_messages = row['messages']
            rejected_messages = row['rejected_messages']
            messages = []
            rejected_response = None
            for chosen_user, chosen_assistant, rejected_user, rejected_assistant in zip(
                    chosen_messages[::2], chosen_messages[1::2], rejected_messages[::2], rejected_messages[1::2]):
                assert chosen_user == rejected_user
                messages.append(chosen_user)
                messages.append(chosen_assistant)
                if chosen_assistant != rejected_assistant:
                    rejected_response = rejected_assistant['content']
            row['messages'] = messages
            row['rejected_response'] = rejected_response

        if 'rejected_response' in row:
            messages = row['messages']
            rejected_response = row['rejected_response']
            if rejected_response is None or rejected_response == messages[-1]['content']:
                raise ValueError(f'rejected_response: {rejected_response}')
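
The checks quoted above can also be run ahead of training, so that a bad record is reported with a readable message instead of an assertion deep inside the loader. The following is a minimal sketch under stated assumptions: `find_dpo_row_problems` is a hypothetical helper written for this thread, not an ms-swift function, and it only covers the rules visible in the quoted snippets (non-empty messages, alternating roles, a rejected_response that differs from the chosen reply).

```python
import json
from typing import Any, Dict, List


def find_dpo_row_problems(row: Dict[str, Any]) -> List[str]:
    """Return readable problems for one DPO record; an empty list means none were found."""
    messages = row.get('messages') or []
    if not messages:
        return ['messages is missing or empty']
    problems = []
    # An optional leading system message is skipped before checking the turns.
    turns = messages[1:] if messages[0].get('role') == 'system' else messages
    for i, message in enumerate(turns):
        allowed = {'user', 'tool'} if i % 2 == 0 else {'assistant'}
        role = message.get('role')
        if role not in allowed:
            problems.append(f'turn {i}: unexpected role {role!r}')
    if not turns or turns[-1].get('role') != 'assistant':
        problems.append('last message must be the chosen assistant reply')
    rejected = row.get('rejected_response')
    if rejected is None:
        problems.append('rejected_response is missing')
    elif turns and rejected == turns[-1].get('content'):
        problems.append('rejected_response is identical to the chosen reply')
    return problems


# Two illustrative JSONL lines: the first uses messages/rejected_response,
# the second uses the question/chosen/rejected keys discussed in this thread.
lines = [
    '{"messages": [{"role": "user", "content": "1+1=?"}, '
    '{"role": "assistant", "content": "2"}], "rejected_response": "I do not know"}',
    '{"question": "1+1=?", "chosen": "2", "rejected": "I do not know"}',
]
for number, line in enumerate(lines, start=1):
    print(number, find_dpo_row_problems(json.loads(line)))
```

Returning a list of problems rather than raising keeps the helper usable for scanning a whole file and reporting every bad line at once.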

Jintao-Huang · 2025-01-28T12:54:46Z

history.append([query, response])
row.update({'messages': history_to_messages(history, system)})

query and response are also put into history.

xiezhipeng-git · 2025-01-28T13:02:34Z

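But I get an error here; the result is [].
That is because history and system are both None. Does this mean system data is required?
But the example dataset referenced in the code
https://www.modelscope.cn/datasets/swift/zhihu_rlhf_3k/dataPeview
has no system field.
And the other DPO example dataset has a completely different format:
https://modelscope.cn/datasets/hjh0119/shareAI-Llama3-DPO-zh-en-emoji/dataPeview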

Jintao-Huang · 2025-01-28T13:46:18Z

Please refer to this: https://swift.readthedocs.io/zh-cn/latest/Customization/%E8%87%AA%E5%AE%9A%E4%B9%89%E6%95%B0%E6%8D%AE%E9%9B%86.html#dpo-orpo-cpo-simpo-rm

xiezhipeng-git · 2025-01-28T15:01:38Z

OK, I will try it. If that is the case, though, the example data referenced in the code needs to be updated.
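
For records that only have question, chosen, and rejected keys, one option is to rewrite them into the messages plus rejected_response layout used by the dataset at the top of this issue. This is a minimal sketch, assuming those three key names; `to_messages_row` is a hypothetical helper, not part of ms-swift, and the linked documentation remains the authority on accepted formats.

```python
import json
from typing import Any, Dict, Optional


def to_messages_row(row: Dict[str, Any], system: Optional[str] = None) -> Dict[str, Any]:
    """Convert a {question, chosen, rejected} record into {messages, rejected_response}."""
    messages = []
    if system is not None:
        messages.append({'role': 'system', 'content': system})
    messages.append({'role': 'user', 'content': row['question']})
    messages.append({'role': 'assistant', 'content': row['chosen']})
    return {'messages': messages, 'rejected_response': row['rejected']}


# Illustrative record; the key names are the ones mentioned in this thread.
record = {'question': 'What is 1+1?', 'chosen': '2', 'rejected': 'I do not know'}
line = json.dumps(to_messages_row(record), ensure_ascii=False)
print(line)  # one JSONL line, ready to be written to the dataset file
```

`ensure_ascii=False` keeps Chinese text readable in the output file; the system prompt is optional and is only added when one is passed in.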
