ValueError: Please specify target_modules in peft_config #2374

Open · thuytrang32 opened this issue Jan 7, 2025 · 0 comments
What happened?

I used the code from this guide to fine-tune an LLM: https://www.kubeflow.org/docs/components/training/user-guides/fine-tuning/
However, I encountered the error `[rank0]: ValueError: Please specify target_modules in peft_config`. I tried deleting the LoRA config, but the error still occurred.


```python
import transformers
from peft import LoraConfig

from kubeflow.training import TrainingClient
from kubeflow.storage_initializer.hugging_face import (
    HuggingFaceModelParams,
    HuggingFaceTrainerParams,
    HuggingFaceDatasetParams,
)

TrainingClient().train(
    name="fine-tune-bert",
    storage_config={
        "size": "5Gi",
        "storage_class": "nfs-client",
    },
    # BERT model URI and type of Transformer to train it.
    model_provider_parameters=HuggingFaceModelParams(
        model_uri="hf://distilbert/distilbert-base-uncased",
        transformer_type=transformers.AutoModelForSequenceClassification,
    ),
    # Use 100 samples from the Yelp dataset.
    dataset_provider_parameters=HuggingFaceDatasetParams(
        repo_id="yelp_review_full",
        split="train[:100]",
    ),
    # Specify HuggingFace Trainer parameters. In this example, we skip evaluation and model checkpoints.
    trainer_parameters=HuggingFaceTrainerParams(
        training_parameters=transformers.TrainingArguments(
            output_dir="test_trainer",
            save_strategy="no",
            evaluation_strategy="no",
            do_eval=False,
            disable_tqdm=True,
            log_level="info",
            # ddp_backend="gloo",
        ),
        # Set LoRA config to reduce number of trainable model parameters.
        # lora_config=LoraConfig(
        #     r=8,
        #     lora_alpha=8,
        #     lora_dropout=0.1,
        #     bias="none",
        #     target_modules=["encoder.layer.*.attention.self.query", "encoder.layer.*.attention.self.key"],
        # ),
    ),
    num_workers=2,  # nnodes parameter for torchrun command.
    num_procs_per_worker=20,  # nproc-per-node parameter for torchrun command.
    resources_per_worker={
        "cpu": 20,
        "memory": "20G",
    },
)
```
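For context, my understanding is that this error is raised by peft when `target_modules` is left unset and the model type has no built-in default mapping. A minimal sketch of the LoRA config I would try next, assuming DistilBERT's attention projection layers are named `q_lin`, `k_lin`, and `v_lin` (those module names are my assumption, not something the Kubeflow guide specifies):

```python
from peft import LoraConfig

# Hypothetical workaround sketch: explicitly name the attention projection
# modules so peft does not need a built-in default for this model type.
# q_lin / k_lin / v_lin are assumed DistilBERT module names.
lora_config = LoraConfig(
    r=8,
    lora_alpha=8,
    lora_dropout=0.1,
    bias="none",
    target_modules=["q_lin", "k_lin", "v_lin"],
)
```

I would expect to pass this via `lora_config=` in `HuggingFaceTrainerParams` (as in the commented-out block above), but I have not confirmed whether the SDK forwards it to peft unchanged.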

What did you expect to happen?

The fine-tuning process completes successfully.

Environment

Kubernetes version:

$ kubectl version
Client Version: v1.29.6+k3s2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.6+k3s2

Training Operator version:

$ kubectl get pods -n kubeflow -l control-plane=kubeflow-training-operator -o jsonpath="{.items[*].spec.containers[*].image}"
kubeflow/training-operator:latest

Training Operator Python SDK version:

$ pip show kubeflow-training
Name: kubeflow-training
Version: 1.8.1
Summary: Training Operator Python SDK
Home-page: https://github.com/kubeflow/training-operator/tree/master/sdk/python
Author: Kubeflow Authors
Author-email: [email protected]
License: Apache License Version 2.0
Location: /opt/conda/lib/python3.11/site-packages
Requires: certifi, kubernetes, retrying, setuptools, six, urllib3
Required-by: 

Impacted by this bug?

👍
