
Error when using the officially provided DPO dataset template #2968

Open
WjMessi1 opened this issue Jan 23, 2025 · 14 comments

Comments

@WjMessi1

WjMessi1 commented Jan 23, 2025

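I used the officially provided DPO dataset template

to build a dataset, /data/Telechat/dpo_refusal_dataset_official.jsonl, and included it in the fine-tuning run below. If I leave this dataset out and use only hjh0119/shareAI-Llama3-DPO-zh-en-emoji, training runs normally.

Dataset contents: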

{"messages": [{"role": "system", "content": "你是个有用无害的助手"}, {"role": "user", "content": "告诉我明天的天气"}, {"role": "assistant", "content": "明天天气晴朗"}], "rejected_response": "我不知道"}
{"messages": [{"role": "system", "content": "你是个有用无害的数学计算器"}, {"role": "user", "content": "1+1等于几"}, {"role": "assistant", "content": "等于2"}, {"role": "user", "content": "再加1呢"}, {"role": "assistant", "content": "等于3"}], "rejected_response": "我不知道"}
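Before training, it can help to confirm that every row of the JSONL file has the same shape as the two samples above. The following is a minimal sketch, not part of ms-swift: the helper names `check_dpo_row` and `check_dpo_jsonl` are hypothetical, and the rules it enforces (a non-empty `messages` list ending with an assistant turn, plus a non-empty string `rejected_response`) are assumptions inferred only from the samples shown here.

```python
import json


def check_dpo_row(row):
    """Return a list of problems found in one preference-data row.

    The rules are assumptions inferred from the sample rows, not a
    specification taken from ms-swift.
    """
    if not isinstance(row, dict):
        return ["row is not a JSON object"]
    problems = []
    messages = row.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("'messages' must be a non-empty list")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                problems.append(f"messages[{i}] is not an object")
                continue
            content = msg.get("content")
            if not isinstance(content, str) or not content:
                problems.append(f"messages[{i}] has empty or non-string content")
        last = messages[-1]
        if isinstance(last, dict) and last.get("role") != "assistant":
            problems.append("last message should be the chosen assistant reply")
    rejected = row.get("rejected_response")
    if not isinstance(rejected, str) or not rejected:
        problems.append("'rejected_response' must be a non-empty string")
    return problems


def check_dpo_jsonl(lines):
    """Map 1-based line numbers to the problems found on that line."""
    report = {}
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            report[lineno] = [f"invalid JSON: {exc}"]
            continue
        problems = check_dpo_row(row)
        if problems:
            report[lineno] = problems
    return report
```

To check a file, pass an open file object, for example `check_dpo_jsonl(open(path, encoding="utf-8"))`. An empty result only means the rows match the assumed shape; it does not prove that the framework will accept them.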

I ran the DPO fine-tuning command, based on the reference script:

NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 swift rlhf --model_type telechat2-115b --rlhf_type dpo --model_id_or_path /data/TeleChat2-7B --dataset hjh0119/shareAI-Llama3-DPO-zh-en-emoji#100 /data/Telechat/dpo_refusal_dataset_official.jsonl#70 --num_train_epochs 10 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --gradient_accumulation_steps 8 --eval_steps 10 --save_steps 10 --save_total_limit 5 --logging_steps 5 --max_length 2048 --output_dir output --warmup_ratio 0.05 --dataloader_num_workers 4

The error is as follows:

[INFO:swift] The RLHFArguments will be saved in: /data/Telechat/TeleChat2/TeleChat2-7B/output/telechat2-115b/v9-20250123-121710/sft_args.json
[INFO:swift] The DPOConfig will be saved in: /data/Telechat/TeleChat2/TeleChat2-7B/output/telechat2-115b/v9-20250123-121710/training_args.json
[INFO:swift] The logging file will be saved in: /data/Telechat/TeleChat2/TeleChat2-7B/output/telechat2-115b/v9-20250123-121710/logging.jsonl

Train:   0%|          | 0/100 [00:00<?, ?it/s]
[rank1]: Traceback (most recent call last):
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/cli/rlhf.py", line 5, in <module>
[rank1]:     rlhf_main()
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/utils/run_utils.py", line 32, in x_main
[rank1]:     result = llm_x(args, **kwargs)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/rlhf.py", line 47, in llm_rlhf
[rank1]:     return trainer_train(
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/sft.py", line 496, in trainer_train
[rank1]:     trainer.train(training_args.resume_from_checkpoint)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/trainers/mixin.py", line 493, in train
[rank1]:     res = super().train(resume_from_checkpoint, *args, **kwargs)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 2164, in train
[rank1]:     return inner_training_loop(
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 2472, in _inner_training_loop
[rank1]:     batch_samples, num_items_in_batch = self.get_batch_samples(epoch_iterator, num_batches)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 5131, in get_batch_samples
[rank1]:     batch_samples += [next(epoch_iterator)]
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/accelerate/data_loader.py", line 552, in __iter__
[rank1]:     current_batch = next(dataloader_iter)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
[rank1]:     data = self._next_data()
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
[rank1]:     return self._process_data(data)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
[rank1]:     data.reraise()
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/_utils.py", line 706, in reraise
[rank1]:     raise exception
[rank1]: RuntimeError: Caught RuntimeError in DataLoader worker process 0.
[rank1]: Original Traceback (most recent call last):
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
[rank1]:     data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
[rank1]:     return self.collate_fn(data)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 4157, in data_collator
[rank1]:     return _data_collator(new_batch or batch, padding_to)
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 1051, in data_collator
[rank1]:     res[key] = [torch.tensor(b[key]) for b in batch]
[rank1]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 1051, in <listcomp>
[rank1]:     res[key] = [torch.tensor(b[key]) for b in batch]
[rank1]: RuntimeError: Could not infer dtype of NoneType

[rank0]: Traceback (most recent call last):
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/cli/rlhf.py", line 5, in <module>
[rank0]:     rlhf_main()
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/utils/run_utils.py", line 32, in x_main
[rank0]:     result = llm_x(args, **kwargs)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/rlhf.py", line 47, in llm_rlhf
[rank0]:     return trainer_train(
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/sft.py", line 496, in trainer_train
[rank0]:     trainer.train(training_args.resume_from_checkpoint)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/trainers/mixin.py", line 493, in train
[rank0]:     res = super().train(resume_from_checkpoint, *args, **kwargs)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 2164, in train
[rank0]:     return inner_training_loop(
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 2472, in _inner_training_loop
[rank0]:     batch_samples, num_items_in_batch = self.get_batch_samples(epoch_iterator, num_batches)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/transformers/trainer.py", line 5131, in get_batch_samples
[rank0]:     batch_samples += [next(epoch_iterator)]
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/accelerate/data_loader.py", line 563, in __iter__
[rank0]:     next_batch = next(dataloader_iter)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
[rank0]:     data = self._next_data()
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1324, in _next_data
[rank0]:     return self._process_data(data)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
[rank0]:     data.reraise()
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/_utils.py", line 706, in reraise
[rank0]:     raise exception
[rank0]: RuntimeError: Caught RuntimeError in DataLoader worker process 1.
[rank0]: Original Traceback (most recent call last):
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
[rank0]:     data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
[rank0]:     return self.collate_fn(data)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 4157, in data_collator
[rank0]:     return _data_collator(new_batch or batch, padding_to)
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 1051, in data_collator
[rank0]:     res[key] = [torch.tensor(b[key]) for b in batch]
[rank0]:   File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/llm/utils/template.py", line 1051, in <listcomp>
[rank0]:     res[key] = [torch.tensor(b[key]) for b in batch]
[rank0]: RuntimeError: Could not infer dtype of NoneType

Exception in thread Exception in thread Thread-3:
Traceback (most recent call last):
  File "/root/anaconda3/envs/telechat2/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/root/anaconda3/envs/telechat2/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/utils/data/_utils/pin_memory.py", line 55, in _pin_memory_loop
    
Train:   0%|          | 0/100 [00:00<?, ?it/s]
E0123 12:17:47.529269 140125589083328 torch/distributed/elastic/multiprocessing/api.py:833] failed (exitcode: 1) local_rank: 0 (pid: 3785853) of binary: /root/anaconda3/envs/telechat2/bin/python
Traceback (most recent call last):
  File "/root/anaconda3/envs/telechat2/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/anaconda3/envs/telechat2/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/run.py", line 905, in <module>
    main()
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper
    return f(*args, **kwargs)
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/run.py", line 901, in main
    run(args)
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/run.py", line 892, in run
    elastic_launch(
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 133, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
/root/anaconda3/envs/telechat2/lib/python3.9/site-packages/swift/cli/rlhf.py FAILED
------------------------------------------------------------
Failures:
[1]:
  time      : 2025-01-23_12:17:47
  host      : ecm-22b5
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 3785854)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-01-23_12:17:47
  host      : ecm-22b5
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 3785853)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
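
On both ranks the last frame is the collator line `res[key] = [torch.tensor(b[key]) for b in batch]`, and `torch.tensor(None)` raises exactly this `Could not infer dtype of NoneType` error, so at least one encoded sample appears to carry `None` under some key. The sketch below is a hypothetical debugging helper, not ms-swift code. It assumes the batch is a list of dicts, as the traceback suggests, and the key names in the usage example are made up for illustration.

```python
def find_none_fields(batch):
    """Return (sample_index, key) pairs whose value is None.

    `batch` is assumed to be a list of dicts, matching the
    `b[key] for b in batch` pattern in the traceback.
    """
    missing = []
    for index, sample in enumerate(batch):
        for key, value in sample.items():
            if value is None:
                missing.append((index, key))
    return missing


# Illustrative batch; the key names are hypothetical.
example_batch = [
    {"input_ids": [1, 2], "labels": [1, 2]},
    {"input_ids": [3], "labels": None},
]
print(find_none_fields(example_batch))
```

Running a check like this on the samples just before collation would show which sample and which field is empty, which narrows the search to either the dataset rows or the preprocessing step.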

Your hardware and system info
ms-swift Version: 2.6.1

@skdom6

skdom6 commented Jan 23, 2025

Have you solved this? I ran into the same problem.

@WjMessi1
Author

Have you solved this? I ran into the same problem.

Not yet. I'm waiting for someone from the ModelScope team to answer.

@skdom6

skdom6 commented Jan 23, 2025

Sorry, my issue is probably different from yours. I just fixed it; I had made a mistake in my own data.

@Jintao-Huang
Collaborator

It works fine in my tests.

Could you try upgrading ms-swift?

@WjMessi1
Author

It works fine in my tests.

Could you try upgrading ms-swift?

OK, I'll try that.

@WjMessi1
Author

WjMessi1 commented Jan 23, 2025

It works fine in my tests.

Could you try upgrading ms-swift?

Hello, I reinstalled the latest version of ms-swift (3.0.3) and ran the following DPO command:

NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 swift rlhf --model_type telechat2 --rlhf_type dpo --model /data/Telechat/TeleChat2/TeleChat2-7B --dataset /data/Telechat/dpo_refusal_dataset_official.jsonl --num_train_epochs 10 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --gradient_accumulation_steps 8 --eval_steps 10 --save_steps 10 --save_total_limit 5 --logging_steps 5 --max_length 2048 --output_dir output --ddp_find_unused_parameters true --warmup_ratio 0.05 --dataloader_num_workers 4 --deepspeed zero2

A new error appears:

[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/mixin.py:77: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `DPOTrainer.__init__`. Use `processing_class` instead.
  super().__init__(                                                                                                                                                                                     
Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.           
Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.                                                                                                                   
[INFO:swift] The logging file will be saved in: /data/Telechat/TeleChat2/TeleChat2-7B/output/v5-20250123-145217/logging.jsonl                                                                           
[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/mixin.py:77: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `DPOTrainer.__init__`. Use `processing_class` instead.
  super().__init__(                                                                                                                                                                                     
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore `gradient_checkpointing_kwargs` in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method `_set_gradient_checkpointing` in your model.
Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.                                                                                                                   
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore `gradient_checkpointing_kwargs` in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method `_set_gradient_checkpointing` in your model.
[rank1]: Traceback (most recent call last):                                                                                                                                                             
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/cli/rlhf.py", line 5, in <module>                                                                                      
[rank1]:     rlhf_main()                                                                                                                                                                                
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/train/rlhf.py", line 92, in rlhf_main                                                                              
[rank1]:     return SwiftRLHF(args).main()                                                                                                                                                              
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/base.py", line 46, in main                                                                                         
[rank1]:     result = self.run()                                                                                                                                                                        
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/train/sft.py", line 137, in run                                                                                    
[rank1]:     return self.train(trainer)                                                                                                                                                                 
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/train/sft.py", line 189, in train                                                                                  
[rank1]:     trainer.train(trainer.args.resume_from_checkpoint)                                                                                                                                         
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/mixin.py", line 261, in train                                                                                 
[rank1]:     res = super().train(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/transformers/trainer.py", line 2164, in train                                                                                
[rank1]:     return inner_training_loop(                                                                                                                                                                
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/transformers/trainer.py", line 2524, in _inner_training_loop
[rank1]:     tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/transformers/trainer.py", line 3654, in training_step
[rank1]:     loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)                                                                                                             
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/rlhf_trainer/rlhf_mixin.py", line 155, in compute_loss                                                        
[rank1]:     res = super().compute_loss(model, inputs, return_outputs=return_outputs)                                                                                                                   
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/trl/trainer/dpo_trainer.py", line 1489, in compute_loss                                                                      
[rank1]:     loss, metrics = self.get_batch_loss_metrics(model, inputs, train_eval="train")                                                                                                             
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/trl/trainer/dpo_trainer.py", line 1415, in get_batch_loss_metrics                                                            
[rank1]:     forward_output = self.concatenated_forward(model, batch)                                                                                                                                   
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/rlhf_trainer/rlhf_mixin.py", line 122, in concatenated_forward                                                
[rank1]:     outputs = model(**model_kwargs, use_cache=False)                                                                                                                                           
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 18, in wrapped_fn                                                                             
[rank1]:     ret_val = func(*args, **kwargs)                                                                                                                                                            
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1914, in forward                                                                          
[rank1]:     loss = self.module(*inputs, **kwargs)                                                                                                                                                      
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/peft/peft_model.py", line 1719, in forward                                                                                   
[rank1]:     return self.base_model(                                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 197, in forward                                                                           
[rank1]:     return self.model.forward(*args, **kwargs)                                                                                                                                                 
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 821, in forward                                                                        
[rank1]:     transformer_outputs = self.transformer(                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 721, in forward                                                                        
[rank1]:     outputs = torch.utils.checkpoint.checkpoint(                                                                                                                                               
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/arguments.py", line 49, in _new_checkpoint                                                                    
[rank1]:     return _old_checkpoint(*args, use_reentrant=use_reentrant_, **kwargs)                                                                                                                      
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/_compile.py", line 32, in inner                                                                                        
[rank1]:     return disable_fn(*args, **kwargs)                                                                                                                                                         
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 632, in _fn                                                                               
[rank1]:     return fn(*args, **kwargs)                                                                                                                                                                 
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 489, in checkpoint                                                                          
[rank1]:     return CheckpointFunction.apply(function, preserve, *args)                                                                                                                                 
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/autograd/function.py", line 575, in apply                                                                              
[rank1]:     return super().apply(*args, **kwargs)  # type: ignore[misc]                                                                                                                                
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 264, in forward                                                                             
[rank1]:     outputs = run_function(*args)   
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 717, in custom_forward                                                                 
[rank1]:     return module(*inputs, use_cache=use_cache, output_attentions=output_attentions)                                                                                                           
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 551, in forward                                                                        
[rank1]:     attn_outputs = self.self_attention(                                                                                                                                                        
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 493, in forward                                                                        
[rank1]:     context_layer = torch.bmm(attention_probs_reshaped, value_layer.transpose(0, 1))                                                                                                           
[rank1]: RuntimeError: expected scalar type BFloat16 but found Float                                                                                                                                    
Train:   0%|                                                                                                                                                                     | 0/10 [00:00<?, ?it/s]

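This is probably a problem with the TeleChat2 7B model. I think I modified its parameters earlier, so I tried the official original from ModelScope. After testing, the original has the same problem. Could you please take a look and see whether it can be fixed?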

@WjMessi1
Author

Sorry, my case is probably different from yours. I just resolved it; I had gotten my own data wrong.

Could you share your run parameters?

@lonngxiang

How much GPU capacity is needed to run this?

@Jintao-Huang
Collaborator

It works fine in my testing here.
Could you try upgrading ms-swift?

Hello, I reinstalled the latest version of ms-swift (3.0.3) and ran the DPO command below:

NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 swift rlhf --model_type telechat2 --rlhf_type dpo --model /data/Telechat/TeleChat2/TeleChat2-7B --dataset /data/Telechat/dpo_refusal_dataset_official.jsonl --num_train_epochs 10 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --gradient_accumulation_steps 8 --eval_steps 10 --save_steps 10 --save_total_limit 5 --logging_steps 5 --max_length 2048 --output_dir output --ddp_find_unused_parameters true --warmup_ratio 0.05 --dataloader_num_workers 4 --deepspeed zero2

A new error now appears:

[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/mixin.py:77: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `DPOTrainer.__init__`. Use `processing_class` instead.
  super().__init__(                                                                                                                                                                                     
Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.           
Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.                                                                                                                   
[INFO:swift] The logging file will be saved in: /data/Telechat/TeleChat2/TeleChat2-7B/output/v5-20250123-145217/logging.jsonl                                                                           
[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
[ERROR:modelscope] The request model: unknown does not exist!                                                                                                                                           
/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/mixin.py:77: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `DPOTrainer.__init__`. Use `processing_class` instead.
  super().__init__(
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore `gradient_checkpointing_kwargs` in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method `_set_gradient_checkpointing` in your model.
Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore `gradient_checkpointing_kwargs` in case you passed it).Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method `_set_gradient_checkpointing` in your model.
[rank1]: Traceback (most recent call last):                                                                                                                                                             
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/cli/rlhf.py", line 5, in <module>                                                                                      
[rank1]:     rlhf_main()                                                                                                                                                                                
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/train/rlhf.py", line 92, in rlhf_main                                                                              
[rank1]:     return SwiftRLHF(args).main()                                                                                                                                                              
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/base.py", line 46, in main                                                                                         
[rank1]:     result = self.run()                                                                                                                                                                        
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/train/sft.py", line 137, in run                                                                                    
[rank1]:     return self.train(trainer)                                                                                                                                                                 
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/llm/train/sft.py", line 189, in train                                                                                  
[rank1]:     trainer.train(trainer.args.resume_from_checkpoint)                                                                                                                                         
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/mixin.py", line 261, in train                                                                                 
[rank1]:     res = super().train(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/transformers/trainer.py", line 2164, in train                                                                                
[rank1]:     return inner_training_loop(                                                                                                                                                                
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/transformers/trainer.py", line 2524, in _inner_training_loop
[rank1]:     tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/transformers/trainer.py", line 3654, in training_step
[rank1]:     loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)                                                                                                             
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/rlhf_trainer/rlhf_mixin.py", line 155, in compute_loss                                                        
[rank1]:     res = super().compute_loss(model, inputs, return_outputs=return_outputs)                                                                                                                   
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/trl/trainer/dpo_trainer.py", line 1489, in compute_loss                                                                      
[rank1]:     loss, metrics = self.get_batch_loss_metrics(model, inputs, train_eval="train")                                                                                                             
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/trl/trainer/dpo_trainer.py", line 1415, in get_batch_loss_metrics                                                            
[rank1]:     forward_output = self.concatenated_forward(model, batch)                                                                                                                                   
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/rlhf_trainer/rlhf_mixin.py", line 122, in concatenated_forward                                                
[rank1]:     outputs = model(**model_kwargs, use_cache=False)                                                                                                                                           
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/deepspeed/utils/nvtx.py", line 18, in wrapped_fn                                                                             
[rank1]:     ret_val = func(*args, **kwargs)                                                                                                                                                            
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/deepspeed/runtime/engine.py", line 1914, in forward                                                                          
[rank1]:     loss = self.module(*inputs, **kwargs)                                                                                                                                                      
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/peft/peft_model.py", line 1719, in forward                                                                                   
[rank1]:     return self.base_model(                                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 197, in forward                                                                           
[rank1]:     return self.model.forward(*args, **kwargs)                                                                                                                                                 
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 821, in forward                                                                        
[rank1]:     transformer_outputs = self.transformer(                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 721, in forward                                                                        
[rank1]:     outputs = torch.utils.checkpoint.checkpoint(                                                                                                                                               
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/swift/trainers/arguments.py", line 49, in _new_checkpoint                                                                    
[rank1]:     return _old_checkpoint(*args, use_reentrant=use_reentrant_, **kwargs)                                                                                                                      
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/_compile.py", line 32, in inner                                                                                        
[rank1]:     return disable_fn(*args, **kwargs)                                                                                                                                                         
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 632, in _fn                                                                               
[rank1]:     return fn(*args, **kwargs)                                                                                                                                                                 
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 489, in checkpoint                                                                          
[rank1]:     return CheckpointFunction.apply(function, preserve, *args)                                                                                                                                 
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/autograd/function.py", line 575, in apply                                                                              
[rank1]:     return super().apply(*args, **kwargs)  # type: ignore[misc]                                                                                                                                
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 264, in forward                                                                             
[rank1]:     outputs = run_function(*args)   
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 717, in custom_forward                                                                 
[rank1]:     return module(*inputs, use_cache=use_cache, output_attentions=output_attentions)                                                                                                           
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 551, in forward                                                                        
[rank1]:     attn_outputs = self.self_attention(                                                                                                                                                        
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl                                                                
[rank1]:     return self._call_impl(*args, **kwargs)                                                                                                                                                    
[rank1]:   File "/root/anaconda3/envs/vllm065/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl                                                                        
[rank1]:     return forward_call(*args, **kwargs)                                                                                                                                                       
[rank1]:   File "/root/.cache/huggingface/modules/transformers_modules/TeleChat2-7B/modeling_telechat2.py", line 493, in forward                                                                        
[rank1]:     context_layer = torch.bmm(attention_probs_reshaped, value_layer.transpose(0, 1))                                                                                                           
[rank1]: RuntimeError: expected scalar type BFloat16 but found Float                                                                                                                                    
Train:   0%|                                                                                                                                                                     | 0/10 [00:00<?, ?it/s]

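This is probably a problem with the TeleChat2 7B model. I think I modified its parameters earlier, so I tried the official original from ModelScope. After testing, the original has the same problem. Could you please take a look and see whether it can be fixed?

Try --dtype float16 or float32.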

@xiezhipeng-git

xiezhipeng-git commented Jan 28, 2025

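@Jintao-Huang
This is with the latest code on the GitHub main branch.
When a JSON file is used as the DPO data and its rows contain only the three keys question, chosen and rejected, ms-swift passes each row through the following function in core: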

def preprocess(self, row: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    response = row.pop('response', None)
    if response is not None:
        if isinstance(response, (list, tuple)):
            # sometimes response is a list, pick one randomly
            response = self.random_state.choice(response)
    history = row.pop('history', None) or []
    query = row.pop('query', None)
    system = row.pop('system', None)
    if isinstance(history, str):  # e.g. "[['query1', 'response1']]"
        history = ast.literal_eval(history)
    history.append([query, response])

    row.update({'messages': history_to_messages(history, system)})
    return row

That function treats the row as having no history and then sets row["messages"] = [].
Next, also in core:

for r in row:
    self._check_messages(r)
    self._check_rejected_response(r)
    self._cast_images(r)

Inside _check_messages:

def _check_messages(row: Dict[str, Any]) -> None:
    if 'messages' not in row:
        return
    messages = row['messages']
    assert len(messages) > 0, f'messages: {messages}'
    # fix swift/SlimOrca
    for message in messages:
        keys = set(message.keys()) - {'role', 'content'}
        for key in keys:
            message.pop(key)

    if messages[0]['role'] == 'system':
        messages = messages[1:]
    if messages and messages[0]['role'] == 'assistant':
        messages = [{'role': 'user', 'content': ''}] + messages  # pretrain
    for user_message, assistant_message in zip(messages[::2], messages[1::2]):
        if (user_message['role'] not in {'user', 'tool'} or 'content' not in user_message
                or user_message['content'] is None):
            raise ValueError(f'user_message: {user_message}')
        if (assistant_message['role'] not in {'assistant'} or 'content' not in assistant_message
                or assistant_message['content'] in {'', None}):
            raise ValueError(f'assistant_message: {assistant_message}')

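Because the messages key has already been added (as an empty list), the code fails immediately at

    assert len(messages) > 0, f'messages: {messages}'

This shouldn't be a dataset problem, should it? I did build the dataset myself, but the logic here contradicts itself.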
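
The behaviour reported above can be reproduced without ms-swift. The sketch below is a simplified stand-in, not the library's code: `history_to_messages_sketch` and `preprocess_sketch` are hypothetical names, and the rule that `None` entries are skipped when building messages is an assumption inferred from the empty list reported in this comment. It suggests why a row keyed by question/chosen/rejected ends up with empty messages, while a row keyed by query/response does not.

```python
from typing import Any, Dict, List, Optional


def history_to_messages_sketch(history: List[List[Optional[str]]],
                               system: Optional[str] = None) -> List[Dict[str, str]]:
    """Hypothetical stand-in: assumes None entries are skipped."""
    messages = []
    if system is not None:
        messages.append({"role": "system", "content": system})
    for query, response in history:
        if query is not None:
            messages.append({"role": "user", "content": query})
        if response is not None:
            messages.append({"role": "assistant", "content": response})
    return messages


def preprocess_sketch(row: Dict[str, Any]) -> Dict[str, Any]:
    """Mirrors the key lookups of the quoted preprocess, nothing more."""
    row = dict(row)  # work on a copy so the caller's row is untouched
    response = row.pop("response", None)
    history = list(row.pop("history", None) or [])
    query = row.pop("query", None)
    system = row.pop("system", None)
    history.append([query, response])
    row["messages"] = history_to_messages_sketch(history, system)
    return row


# Keys that the function never looks up: query and response stay None.
unmapped = preprocess_sketch({"question": "1+1?", "chosen": "2", "rejected": "3"})
# Keys that the function does look up.
mapped = preprocess_sketch({"query": "1+1?", "response": "2", "rejected_response": "3"})

print(unmapped["messages"])
print(mapped["messages"])
```

Under these assumptions the first row yields an empty `messages` list, which is exactly the input that trips `assert len(messages) > 0`; neither `history` nor `system` has to be present for the second row to work.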

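There is another point worth discussing. In the ms-swift code, DPO data consists of a chosen answer and a rejected answer. Both answers may be wrong; the chosen one is merely the relatively better of the two. From what I can see, though, the code names these chosen answers as if they were ordinary correct answers. (I haven't read the later processing code yet.) Is the naming in this part of the code perhaps not quite appropriate? (If the later processing needs a unified format and that is why it is named this way, then it doesn't matter.) I'm just pointing it out.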

    def _check_rejected_response(row: Dict[str, Any]) -> None:
        if 'rejected_messages' in row:
            chosen_messages = row['messages']
            rejected_messages = row['rejected_messages']
            messages = []
            rejected_response = None
            for chosen_user, chosen_assistant, rejected_user, rejected_assistant in zip(
                    chosen_messages[::2], chosen_messages[1::2], rejected_messages[::2], rejected_messages[1::2]):
                assert chosen_user == rejected_user
                messages.append(chosen_user)
                messages.append(chosen_assistant)
                if chosen_assistant != rejected_assistant:
                    rejected_response = rejected_assistant['content']
            row['messages'] = messages
            row['rejected_response'] = rejected_response

        if 'rejected_response' in row:
            messages = row['messages']
            rejected_response = row['rejected_response']
            if rejected_response is None or rejected_response == messages[-1]['content']:
                raise ValueError(f'rejected_response: {rejected_response}')
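
On the naming question, it may help to recall that the standard DPO objective depends only on how much the policy prefers the chosen answer over the rejected one, relative to a reference model. The sketch below is a plain-Python illustration of that per-pair loss, not code from ms-swift or trl; the function name `dpo_pair_loss` and the sample log-probabilities are made up for the example.

```python
import math


def dpo_pair_loss(policy_chosen_logp: float, policy_rejected_logp: float,
                  ref_chosen_logp: float, ref_rejected_logp: float,
                  beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin))."""
    # How much more the policy favours each answer than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    margin = chosen_gain - rejected_gain
    # -log(sigmoid(x)) == log(1 + exp(-x))
    return math.log1p(math.exp(-beta * margin))


# Policy identical to the reference: no preference learned yet.
neutral = dpo_pair_loss(-5.0, -5.0, -5.0, -5.0)
# Policy shifted towards the chosen answer and away from the rejected one.
improved = dpo_pair_loss(-4.0, -6.0, -5.0, -5.0)

print(neutral, improved)
```

Only the difference between the two answers enters the loss, so the chosen answer acts as "relatively better" rather than "correct", whatever the field holding it is called.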

@Jintao-Huang
Collaborator

history.append([query, response])
row.update({'messages': history_to_messages(history, system)})

query and response are also put into history.

@xiezhipeng-git

xiezhipeng-git commented Jan 28, 2025

history.append([query, response])
row.update({'messages': history_to_messages(history, system)})

query and response are also put into history.

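But I get an error here; the result is [].
That is because history and system are both None. Does this mean system data is required?
Yet the example referenced in the code,
https://www.modelscope.cn/datasets/swift/zhihu_rlhf_3k/dataPeview
has no system field.
And the other DPO example dataset uses a completely different format:
https://modelscope.cn/datasets/hjh0119/shareAI-Llama3-DPO-zh-en-emoji/dataPeview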

@xiezhipeng-git

How about following this: https://swift.readthedocs.io/zh-cn/latest/Customization/%E8%87%AA%E5%AE%9A%E4%B9%89%E6%95%B0%E6%8D%AE%E9%9B%86.html#dpo-orpo-cpo-simpo-rm

OK, I'll give it a try. But if that is the case, the example data referenced in the code needs to be updated.
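
For anyone with the same question/chosen/rejected data, one possible workaround is to rewrite each row into the messages plus rejected_response layout quoted at the top of this issue before passing the file to swift. The sketch below assumes exactly those three input keys; `to_messages_row` is a hypothetical helper, not part of ms-swift, and the sample row is invented.

```python
import json
from typing import Any, Dict, Optional


def to_messages_row(row: Dict[str, Any], system: Optional[str] = None) -> Dict[str, Any]:
    """Convert a question/chosen/rejected row (assumed keys) to messages form."""
    messages = []
    if system is not None:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": row["question"]})
    messages.append({"role": "assistant", "content": row["chosen"]})
    return {"messages": messages, "rejected_response": row["rejected"]}


raw = {"question": "What is 1+1?", "chosen": "It equals 2.", "rejected": "I don't know."}

# One JSON object per line, as in a .jsonl dataset file.
converted_line = json.dumps(to_messages_row(raw), ensure_ascii=False)
print(converted_line)
```

The system message is optional here; the converted row already has a non-empty `messages` list ending in the chosen answer, with a different `rejected_response`.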
