Is there a training script for Qwen2.5? #979

Open
liguoyu666 opened this issue Dec 27, 2024 · 1 comment

Comments

@liguoyu666

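I trained with a modified version of the Qwen1.5 script. Training completed, but converting the model afterwards failed with this error: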

root@dsw-495481-6669bbb757-kdfwx:/mnt/workspace/xtuner_train# xtuner convert pth_to_hf qwen_train.py ./work_dirs/qwen_train/iter_723.pth ./work_dirs/hf_qwen_model
[2024-12-27 10:52:31,101] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-12-27 10:52:34,600] [INFO] [real_accelerator.py:161:get_accelerator] Setting ds_accelerator to cuda (auto detect)
`low_cpu_mem_usage` was None, now set to True since model is quantized.
Loading checkpoint shards: 100%|██████████| 3/3 [00:03<00:00, 1.18s/it]
Some weights of Qwen2ForCausalLM were not initialized from the model checkpoint at ./qwen2.5-3b-instruct and are newly initialized: ['lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/site-packages/xtuner/tools/model_converters/pth_to_hf.py", line 139, in <module>
    main()
  File "/opt/conda/lib/python3.10/site-packages/xtuner/tools/model_converters/pth_to_hf.py", line 95, in main
    model = BUILDER.build(cfg.model)
  File "/opt/conda/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/opt/conda/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/opt/conda/lib/python3.10/site-packages/xtuner/model/sft.py", line 83, in __init__
    self.llm = self.build_llm_from_cfg(llm, use_varlen_attn,
  File "/opt/conda/lib/python3.10/site-packages/xtuner/model/sft.py", line 126, in build_llm_from_cfg
    llm = self._build_from_cfg_or_module(llm)
  File "/opt/conda/lib/python3.10/site-packages/xtuner/model/sft.py", line 248, in _build_from_cfg_or_module
    return BUILDER.build(cfg_or_mod)
  File "/opt/conda/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/opt/conda/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3589, in from_pretrained
    dispatch_model(model, **device_map_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/big_modeling.py", line 399, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "/opt/conda/lib/python3.10/site-packages/accelerate/hooks.py", line 517, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/hooks.py", line 156, in add_hook_to_module
    module = hook.init_hook(module)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/hooks.py", line 254, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device)
  File "/opt/conda/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 306, in set_module_tensor_to_device
    raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
ValueError: weight is on the meta device, we need a `value` to put in on 0.
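The log reports `lm_head.weight` as newly initialized just before the meta-device error. One possible explanation, which the log does not confirm, is that the checkpoint shares its output head with the input embedding and so stores no separate `lm_head.weight` tensor. A minimal sketch for checking that setting in the local model directory follows; the helper name `tied_embeddings_setting` is hypothetical, and it only reads the `tie_word_embeddings` key from `config.json`.

```python
import json
from pathlib import Path
from typing import Optional


def tied_embeddings_setting(model_dir: str) -> Optional[bool]:
    """Return the tie_word_embeddings value from <model_dir>/config.json.

    Returns None when the key is absent, because the effective default
    then depends on the model class and cannot be read from the file.
    Hypothetical helper for illustration only.
    """
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    value = config.get("tie_word_embeddings")
    return None if value is None else bool(value)


# Example call, assuming the model directory from the log exists locally:
# print(tied_embeddings_setting("./qwen2.5-3b-instruct"))
```

If this returns `True`, the missing `lm_head.weight` is expected on disk, and the failure would more likely come from how the quantized model is dispatched across devices than from a corrupted checkpoint.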


liguoyu666 commented Dec 27, 2024

Main modifications:
```python
# Model
pretrained_model_name_or_path = r'./qwen2.5-3b-instruct'

# Data
# alpaca_en_path = 'tatsu-lab/alpaca'
data_path = r'data/multi_turn_dataset_2.json'

SYSTEM = 'XXXX'
evaluation_inputs = ['XXXX']

# dataset=dict(type=load_dataset, path=alpaca_en_path),
dataset=dict(type=load_dataset, path='json', data_files=dict(train=data_path)),

dataset_map_fn=None,

# set visualizer
from mmengine.visualization import Visualizer, TensorboardVisBackend
visualizer = dict(type=Visualizer, vis_backends=[dict(type=TensorboardVisBackend)])
```
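With `dataset_map_fn=None`, the JSON file is passed through without a mapping step, so it has to be in the training format already. The sketch below checks a file against the layout assumed here for XTuner multi-turn data: a list of samples, each with a `conversation` list whose turns carry string `input` and `output` fields (`system` optional). That layout is an assumption taken from memory of the XTuner custom-dataset documentation, and the helper name `find_format_problems` is hypothetical.

```python
import json
from typing import Any, List


def find_format_problems(records: Any) -> List[str]:
    """List deviations from the assumed multi-turn 'conversation' layout."""
    if not isinstance(records, list):
        return ["top level must be a list of samples"]
    problems: List[str] = []
    for i, sample in enumerate(records):
        turns = sample.get("conversation") if isinstance(sample, dict) else None
        if not isinstance(turns, list) or not turns:
            problems.append(f"sample {i}: missing or empty 'conversation' list")
            continue
        for j, turn in enumerate(turns):
            if not isinstance(turn, dict):
                problems.append(f"sample {i}, turn {j}: turn is not an object")
                continue
            # 'system' is treated as optional; 'input' and 'output' are required.
            for key in ("input", "output"):
                if not isinstance(turn.get(key), str):
                    problems.append(f"sample {i}, turn {j}: '{key}' must be a string")
    return problems


# Example call, assuming the data file from the config exists locally:
# with open("data/multi_turn_dataset_2.json", encoding="utf-8") as f:
#     print(find_format_problems(json.load(f)))
```

An empty result means the file matches the assumed layout. It does not rule out other causes of the conversion error.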
