From 724acb9716d5124c83993a3d92b6e6bb8bb73668 Mon Sep 17 00:00:00 2001
From: Sergio Paniego Blanco
Date: Thu, 6 Feb 2025 18:28:05 +0100
Subject: [PATCH] 💡 Add 'Post training an LLM for reasoning with GRPO in TRL'
 tutorial (#2785)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/source/community_tutorials.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/source/community_tutorials.md b/docs/source/community_tutorials.md
index b23f60ccff..0cd5e8e25f 100644
--- a/docs/source/community_tutorials.md
+++ b/docs/source/community_tutorials.md
@@ -6,6 +6,7 @@ Community tutorials are made by active members of the Hugging Face community who
 | Task | Class | Description | Author | Tutorial | Colab |
 | --- | --- | --- | --- | --- | --- |
+| Reinforcement Learning | [`GRPOTrainer`] | Post training an LLM for reasoning with GRPO in TRL | [Sergio Paniego](https://huggingface.co/sergiopaniego) | [Link](https://huggingface.co/learn/cookbook/fine_tuning_llm_grpo_trl) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_llm_grpo_trl.ipynb) |
 | Reinforcement Learning | [`GRPOTrainer`] | Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial | [Philipp Schmid](https://huggingface.co/philschmid) | [Link](https://www.philschmid.de/mini-deepseek-r1) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/philschmid/deep-learning-pytorch-huggingface/blob/main/training/mini-deepseek-r1-aha-grpo.ipynb) |
 | Instruction tuning | [`SFTTrainer`] | Fine-tuning Google Gemma LLMs using ChatML format with QLoRA | [Philipp Schmid](https://huggingface.co/philschmid) | [Link](https://www.philschmid.de/fine-tune-google-gemma) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/philschmid/deep-learning-pytorch-huggingface/blob/main/training/gemma-lora-example.ipynb) |
 | Structured Generation | [`SFTTrainer`] | Fine-tuning Llama-2-7B to generate Persian product catalogs in JSON using QLoRA and PEFT | [Mohammadreza Esmaeilian](https://huggingface.co/Mohammadreza) | [Link](https://huggingface.co/learn/cookbook/en/fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format.ipynb) |
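
For context, the new row points readers at a GRPO post-training recipe built on TRL's [`GRPOTrainer`]. Below is a minimal sketch of that workflow, assuming a recent TRL release; the toy dataset, model checkpoint, config values, and length-based reward are illustrative stand-ins, not code from the linked tutorial.

```python
# Minimal GRPO sketch (illustrative; names below are assumptions, not the tutorial's code).
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompt-only dataset: GRPO samples its own completions for each prompt.
train_dataset = Dataset.from_dict(
    {"prompt": ["What is 2 + 2?", "Name a prime number greater than 10."]}
)

def reward_len(completions, **kwargs):
    # Hypothetical reward: favor shorter completions (a stand-in for the
    # reasoning/correctness reward a real recipe would use).
    return [-float(len(completion)) for completion in completions]

training_args = GRPOConfig(
    output_dir="grpo-demo",          # illustrative output path
    num_generations=2,               # completions sampled per prompt (group size)
    per_device_train_batch_size=2,   # effective batch size must be divisible by num_generations
    max_completion_length=64,
)

trainer = GRPOTrainer(
    model="Qwen/Qwen2-0.5B-Instruct",  # any small causal LM checkpoint works here
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=train_dataset,
)
trainer.train()
```

The group size matters because GRPO scores each sampled completion with the reward function and computes advantages relative to the other completions in the same group, pushing the policy toward higher-reward generations without a separate value model.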