What happened?
I used the code from this guide to fine-tune an LLM: https://www.kubeflow.org/docs/components/training/user-guides/fine-tuning/
However, I encountered the error `[rank0]: ValueError: Please specify target_modules in peft_config`. I tried removing the LoRA config, but the error persisted.
```python
import transformers
from peft import LoraConfig

from kubeflow.training import TrainingClient
from kubeflow.storage_initializer.hugging_face import (
    HuggingFaceModelParams,
    HuggingFaceTrainerParams,
    HuggingFaceDatasetParams,
)

TrainingClient().train(
    name="fine-tune-bert",
    # BERT model URI and type of Transformer to train it.
    storage_config={
        "size": "5Gi",
        "storage_class": "nfs-client",
    },
    model_provider_parameters=HuggingFaceModelParams(
        model_uri="hf://distilbert/distilbert-base-uncased",
        transformer_type=transformers.AutoModelForSequenceClassification,
    ),
    # Use 100 samples from the Yelp dataset.
    dataset_provider_parameters=HuggingFaceDatasetParams(
        repo_id="yelp_review_full",
        split="train[:100]",
    ),
    # Specify HuggingFace Trainer parameters. In this example, we skip evaluation and model checkpoints.
    trainer_parameters=HuggingFaceTrainerParams(
        training_parameters=transformers.TrainingArguments(
            output_dir="test_trainer",
            save_strategy="no",
            evaluation_strategy="no",
            do_eval=False,
            disable_tqdm=True,
            log_level="info",
            # ddp_backend="gloo",
        ),
        # Set LoRA config to reduce the number of trainable model parameters.
        # lora_config=LoraConfig(
        #     r=8,
        #     lora_alpha=8,
        #     lora_dropout=0.1,
        #     bias="none",
        #     target_modules=["encoder.layer.*.attention.self.query", "encoder.layer.*.attention.self.key"],
        # ),
    ),
    num_workers=2,  # nnodes parameter for torchrun command.
    num_procs_per_worker=20,  # nproc-per-node parameter for torchrun command.
    resources_per_worker={
        "cpu": 20,
        "memory": "20G",
    },
)
```
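For reference, below is a sketch of the kind of explicit LoRA config the error message seems to ask for. It assumes the trainer still builds a PEFT model even when lora_config is omitted, and that DistilBERT's attention projection layers are named q_lin/k_lin (the BERT-style paths in the commented-out config, such as encoder.layer.*.attention.self.query, do not exist in DistilBERT). This is not a confirmed fix, just what I would try next.

```python
import transformers
from peft import LoraConfig
from kubeflow.storage_initializer.hugging_face import HuggingFaceTrainerParams

# Assumption: DistilBERT's attention projections are q_lin / k_lin / v_lin / out_lin,
# so target_modules should use these names rather than BERT-style module paths.
lora_config = LoraConfig(
    r=8,
    lora_alpha=8,
    lora_dropout=0.1,
    bias="none",
    target_modules=["q_lin", "k_lin"],
)

trainer_parameters = HuggingFaceTrainerParams(
    training_parameters=transformers.TrainingArguments(
        output_dir="test_trainer",
        save_strategy="no",
        evaluation_strategy="no",
        do_eval=False,
        disable_tqdm=True,
        log_level="info",
    ),
    # Pass the LoRA config explicitly instead of leaving it unset.
    lora_config=lora_config,
)
```

That the error also appears with lora_config removed suggests a default LoraConfig without target_modules may be applied somewhere in the trainer, which would explain why deleting the config did not help.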
What did you expect to happen?
The fine-tuning process passes successfully.
Environment
Kubernetes version:

$ kubectl version
Client Version: v1.29.6+k3s2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.6+k3s2
Training Operator version:
$ kubectl get pods -n kubeflow -l control-plane=kubeflow-training-operator -o jsonpath="{.items[*].spec.containers[*].image}"
kubeflow/training-operator:latest
Training Operator Python SDK version:

$ pip show kubeflow-training
Name: kubeflow-training
Version: 1.8.1
Summary: Training Operator Python SDK
Home-page: https://github.com/kubeflow/training-operator/tree/master/sdk/python
Author: Kubeflow Authors
Author-email: [email protected]
License: Apache License Version 2.0
Location: /opt/conda/lib/python3.11/site-packages
Requires: certifi, kubernetes, retrying, setuptools, six, urllib3
Required-by:
Impacted by this bug?
👍