Actions: huggingface/trl

Hugging Face Issue Labeler

83 workflow runs

feat(GRPOTrainer): reward_func return None to skip
Hugging Face Issue Labeler #83: Issue #2737 opened by ctjlewis
February 2, 2025 08:25 35s
PLZ make padding_free for DataCollatorForChatML.
Hugging Face Issue Labeler #82: Issue #2736 opened by YooSungHyun
February 2, 2025 05:44 36s
SFTvsRL SFT Memorizes, RL Generalizes
Hugging Face Issue Labeler #81: Issue #2735 opened by NickyDark1
February 2, 2025 03:56 22s
GRPO Trainer supports VLMs
Hugging Face Issue Labeler #80: Issue #2734 opened by sunildkumar
February 2, 2025 02:59 27s
DPOTrainer Loss
Hugging Face Issue Labeler #79: Issue #2733 opened by jeromeku
February 2, 2025 02:39 23s
GKD Example why do not use labels?
Hugging Face Issue Labeler #78: Issue #2732 opened by YooSungHyun
February 2, 2025 02:39 42s
Latest TRL code = significantly worse rewards for GRPO training
Hugging Face Issue Labeler #77: Issue #2731 opened by abacaj
February 2, 2025 01:18 24s
Training Agents with GRPO
Hugging Face Issue Labeler #76: Issue #2723 opened by August-murr
January 31, 2025 19:47 27s
OOM for 7B model on A100 80Gb
Hugging Face Issue Labeler #75: Issue #2719 opened by JohnConnor123
January 31, 2025 13:17 36s
AttributeError: 'AutoModelForCausalLMWithValueHead' object has no attribute 'base_model_prefix'
Hugging Face Issue Labeler #74: Issue #2718 opened by Tarak200
January 31, 2025 10:08 52s
GRPO for RL on agent trajectories
Hugging Face Issue Labeler #73: Issue #2715 opened by korbinian-hoermann
January 31, 2025 09:09 51s
Isn't the reward *minimized* when len(completion)==20 if this is the reward function?
Hugging Face Issue Labeler #72: Issue #2714 opened by cfpark00
January 31, 2025 09:03 22s
GRPO with tool calling
Hugging Face Issue Labeler #71: Issue #2712 opened by accupham
January 31, 2025 07:25 26s
LoRA 'trainable params: 0'
Hugging Face Issue Labeler #70: Issue #2711 opened by shannonruxin
January 31, 2025 04:50 28s
Examples in training VDPO on llava1.6
Hugging Face Issue Labeler #69: Issue #2710 opened by lucasjinreal
January 31, 2025 04:22 42s
GRPO memory bottleneck from num_generations in compute_loss
Hugging Face Issue Labeler #68: Issue #2709 opened by willccbb
January 31, 2025 03:54 40s
PPOTrainer + LoRA and Continued Training
Hugging Face Issue Labeler #67: Issue #2707 opened by kooryan
January 30, 2025 20:19 37s
Multi-GPU sampling for vLLM in GRPO Trainer
Hugging Face Issue Labeler #66: Issue #2706 opened by nch0w
January 30, 2025 20:09 25s
January 30, 2025 19:09 34s
GRPO: Why does loss start at 0 for first K steps and then increase over time?
Hugging Face Issue Labeler #64: Issue #2703 opened by arnavgarg1
January 30, 2025 18:27 28s
Exposing GenerationConfig in the GRPO Trainer
Hugging Face Issue Labeler #63: Issue #2702 opened by Superskyyy
January 30, 2025 18:00 28s
Allow pretokenized dataset in GRPO Trainer
Hugging Face Issue Labeler #62: Issue #2701 opened by Superskyyy
January 30, 2025 17:57 27s
GRPO VLLM does not work with Lora
Hugging Face Issue Labeler #61: Issue #2698 opened by gagan3012
January 30, 2025 16:03 44s
I cannot launch PPOTrainning script with accelerate launch
Hugging Face Issue Labeler #60: Issue #2696 opened by daehuikim
January 30, 2025 15:38 30s
OOM 8xH100 using latest GRPO code with vLLM
Hugging Face Issue Labeler #59: Issue #2688 opened by abacaj
January 30, 2025 05:55 31s