ORPO on SFT dataset #2570

Open
7 of 9 tasks
vitalyshalumov opened this issue Jan 15, 2025 · 0 comments
Labels
🏋 ORPO Related to ORPO ❓ question Seeking clarification or more information

System Info

Hello,
As I understand it, ORPO combines SFT and preference optimization in a single step.
But what if I only have SFT data, with no preference data?
Can I still use ORPO with an unpaired_preference dataset?
Or is the right approach to run SFT first, and then ORPO on a preference dataset if I have one?
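
For reference, here is a minimal sketch of the ORPO objective as I understand it from the ORPO paper: an SFT negative log-likelihood term on the chosen completion plus a weighted odds-ratio term that compares the chosen and rejected completions. The function name `orpo_loss`, its arguments, and the default weight `lam` are illustrative assumptions, not TRL's API, and the inputs are assumed to be length-normalized log-probabilities strictly below zero.

```python
import math


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """Per-example ORPO loss (illustrative sketch, not TRL's implementation).

    Both inputs are assumed to be average per-token log-probabilities,
    strictly negative so that the implied probability is below 1.
    """

    def log_odds(avg_logp: float) -> float:
        # log(p / (1 - p)) with p = exp(avg_logp)
        return avg_logp - math.log1p(-math.exp(avg_logp))

    # Log odds ratio between the chosen and the rejected completion.
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)

    # Odds-ratio term: -log(sigmoid(x)) == log(1 + exp(-x)).
    or_term = math.log1p(math.exp(-log_odds_ratio))

    # SFT term: negative log-likelihood of the chosen completion.
    sft_term = -chosen_avg_logp

    return sft_term + lam * or_term


if __name__ == "__main__":
    # The loss is lower when the chosen completion is the more likely one.
    print(orpo_loss(-0.5, -2.0))
    print(orpo_loss(-2.0, -0.5))
```

In this formulation the odds-ratio term needs a rejected completion for every prompt. With SFT-only data that term cannot be computed, and setting `lam` to 0 leaves only the plain SFT loss, which is the motivation for the question above.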

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

from trl import ...

outputs:

Traceback (most recent call last):
  File "example.py", line 42, in <module>
    ...

Expected behavior

Guidance on how to use ORPO when only SFT data (no preference pairs) is available.

Checklist

  • I have checked that my issue isn't already filed (see open issues)
  • I have included my system information
  • Any code provided is minimal, complete, and reproducible (more on MREs)
  • Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
  • Any traceback provided is complete
@August-murr August-murr added ❓ question Seeking clarification or more information 🏋 ORPO Related to ORPO labels Jan 16, 2025