Commit
[DPO] add reference log-prob outputs in DPO (#521)
## Summary

Since DPO uses a reference model, we also need to return the reference logprobs in DPO.

## Testing Done

- Hardware Type: <BLANK>
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence

Co-authored-by: Shao Tang <[email protected]>
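To illustrate what returning the reference logprobs means, here is a minimal pure-Python sketch of the standard DPO loss for a single preference pair. It is not the code changed in this commit: the function name `dpo_loss_with_ref_logps`, the `DPOOutput` container, and the default `beta=0.1` are hypothetical, chosen only for illustration.

```python
import math
from typing import NamedTuple


class DPOOutput(NamedTuple):
    """Hypothetical container: the loss plus the log-probs it was computed from."""
    loss: float
    policy_chosen_logp: float
    policy_rejected_logp: float
    ref_chosen_logp: float
    ref_rejected_logp: float


def dpo_loss_with_ref_logps(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed default, not taken from the commit
) -> DPOOutput:
    # How strongly each model prefers the chosen response over the rejected one.
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)

    # -log(sigmoid(x)) == log(1 + exp(-x)), written in a numerically stable form.
    loss = max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))

    # Return the reference log-probs next to the policy ones so callers can log them.
    return DPOOutput(
        loss,
        policy_chosen_logp,
        policy_rejected_logp,
        ref_chosen_logp,
        ref_rejected_logp,
    )
```

When the policy and the reference model assign identical log-probs, the logits are zero and the loss equals `log(2)`, which makes a convenient sanity check for a sketch like this.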