From b643d7f277022794bbab37fd782c0d68e7281b39 Mon Sep 17 00:00:00 2001
From: Michael Benayoun
Date: Tue, 23 Jan 2024 15:57:52 +0100
Subject: [PATCH] Docs nits (#428)

* Add link to colab and studio lab
* Rewording: paragraph -> section
* Add explanation about hallucination
* Add link to notebooks
---
 docs/source/tutorials/fine_tune_bert.mdx     |  2 ++
 docs/source/tutorials/fine_tune_llama_7b.mdx |  2 ++
 docs/source/tutorials/llama2-13b-chatbot.mdx | 10 ++++++--
 docs/source/tutorials/stable_diffusion.mdx   |  6 ++++-
 .../text-generation/llama2-13b-chatbot.ipynb | 24 +++++++++++++++----
 5 files changed, 37 insertions(+), 7 deletions(-)

diff --git a/docs/source/tutorials/fine_tune_bert.mdx b/docs/source/tutorials/fine_tune_bert.mdx
index bd4679547..a189e72ad 100644
--- a/docs/source/tutorials/fine_tune_bert.mdx
+++ b/docs/source/tutorials/fine_tune_bert.mdx
@@ -16,6 +16,8 @@ limitations under the License.
 
 # Fine-tune BERT for Text Classification on AWS Trainium
 
+*There is a notebook version of this tutorial [here](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/text-classification/notebook.ipynb)*.
+
 This tutorial will help you get started with [AWS Trainium](https://aws.amazon.com/machine-learning/trainium/?nc1=h_ls) and Hugging Face Transformers. It will cover how to set up a Trainium instance on AWS, and how to load & fine-tune a transformers model for text classification.
 
 You will learn how to:
diff --git a/docs/source/tutorials/fine_tune_llama_7b.mdx b/docs/source/tutorials/fine_tune_llama_7b.mdx
index dc0df0b65..ddd814885 100644
--- a/docs/source/tutorials/fine_tune_llama_7b.mdx
+++ b/docs/source/tutorials/fine_tune_llama_7b.mdx
@@ -16,6 +16,8 @@ limitations under the License.
 
 # Fine-tune and Test Llama 2 7B on AWS Trainium
 
+*There is a notebook version of this tutorial [here](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/text-generation/llama2-7b-fine-tuning.ipynb)*.
+
 This tutorial will teach you how to fine-tune open LLMs like [Llama 2](https://huggingface.co/meta-llama/Llama-2-7b-hf) on AWS Trainium. In our example, we are going to leverage Hugging Face [Optimum Neuron](https://huggingface.co/docs/optimum-neuron/index), [Transformers](https://huggingface.co/docs/transformers/index) and [Datasets](https://huggingface.co/docs/datasets/index).
 
 You will learn how to:
diff --git a/docs/source/tutorials/llama2-13b-chatbot.mdx b/docs/source/tutorials/llama2-13b-chatbot.mdx
index 80132859f..40458786e 100644
--- a/docs/source/tutorials/llama2-13b-chatbot.mdx
+++ b/docs/source/tutorials/llama2-13b-chatbot.mdx
@@ -47,7 +47,7 @@ When exporting the model, we will specify two sets of parameters:
 
 Depending on your choice of parameters and Inferentia host, this may take from a few minutes to more than an hour.
 
-For your convenience, we host a pre-compiled version of that model on the Hugging Face hub, so you can skip the export and start using the model immediately in paragraph 2.
+For your convenience, we host a pre-compiled version of that model on the Hugging Face hub, so you can skip the export and start using the model immediately in section 2.
 
 ```python
@@ -129,7 +129,7 @@ using an *inf2.24xlarge* instance.
 
 Once your model has been exported, you can generate text using the transformers library, as described in [detail in this post](https://huggingface.co/blog/how-to-generate).
 
-If as suggested you skipped the first paragraph, don't worry: we will use a precompiled model already present on the hub instead.
+If, as suggested, you skipped the first section, don't worry: we will use a precompiled model already present on the hub instead.
 
 ```python
@@ -249,3 +249,9 @@ print(chat("My favorite color is blue. My favorite fruit is strawberry.", histor
 print(chat("Name a fruit that is of my favorite color.", history, max_tokens))
 print(chat("What is the color of my favorite fruit?", history, max_tokens))
 ```
+
+<Tip warning={true}>
+
+While very powerful, large language models can sometimes *hallucinate*. We call *hallucinations* generated content that is irrelevant or made up, but presented by the model as if it were accurate. This is a flaw of LLMs and is not a side effect of using them on Trainium / Inferentia.
+
+</Tip>
diff --git a/docs/source/tutorials/stable_diffusion.mdx b/docs/source/tutorials/stable_diffusion.mdx
index 6dcf61de0..60a775966 100644
--- a/docs/source/tutorials/stable_diffusion.mdx
+++ b/docs/source/tutorials/stable_diffusion.mdx
@@ -18,6 +18,8 @@ limitations under the License.
 
 ## Stable Diffusion
 
+*There is a notebook version of this tutorial [here](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/stable-diffusion/stable-diffusion-txt2img.ipynb)*.
+
 🤗 `Optimum` extends `Diffusers` to support inference on the second generation of Neuron devices (powering Trainium and Inferentia 2). It aims to inherit the ease of use of Diffusers on Neuron.
 
 To get started, make sure you have [configured your inf2 / trn1 instance](../installation) and installed optimum:
@@ -173,6 +175,8 @@ image.save("cat_on_bench.png")
 
 ## Stable Diffusion XL
 
+*There is a notebook version of this tutorial [here](https://github.com/huggingface/optimum-neuron/blob/main/notebooks/stable-diffusion/stable-diffusion-xl-txt2img.ipynb)*.
+
 Stable Diffusion XL (SDXL) is a latent diffusion model for text-to-image generation. Compared to previous versions of Stable Diffusion, it improves the quality of generated images with a 3 times larger UNet.
 
 ### Compile Stable Diffusion XL
@@ -465,4 +469,4 @@ Inf2 instances contain one or more Neuron devices, and each Neuron device includ
 
 
 
-Are there any other stable diffusion features that you want us to support in 🤗`Optimum-neuron`? Please file an issue to [`Optimum-neuron` Github repo](https://github.com/huggingface/optimum-neuron) or discuss with us on [HuggingFace’s community forum](https://discuss.huggingface.co/c/optimum/), cheers 🤗 !
\ No newline at end of file
+Are there any other Stable Diffusion features that you want us to support in 🤗 `Optimum-neuron`? Please file an issue in the [`Optimum-neuron` GitHub repo](https://github.com/huggingface/optimum-neuron) or discuss with us on [Hugging Face’s community forum](https://discuss.huggingface.co/c/optimum/), cheers 🤗!
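The stable_diffusion.mdx tutorial patched above follows a compile-once, reuse-many-times flow: the pipeline is exported with static input shapes, saved, and then reloaded for generation. A minimal sketch of that flow, assuming the `NeuronStableDiffusionPipeline` API from `optimum.neuron`; the model id and shapes are illustrative rather than taken from this patch:

```python
from optimum.neuron import NeuronStableDiffusionPipeline

# Export (compile) the pipeline once with static input shapes, which Neuron
# devices require, then save the compiled artifacts for later reuse.
model_id = "runwayml/stable-diffusion-v1-5"  # illustrative checkpoint
input_shapes = {"batch_size": 1, "height": 512, "width": 512}
pipeline = NeuronStableDiffusionPipeline.from_pretrained(model_id, export=True, **input_shapes)
pipeline.save_pretrained("sd_neuron/")

# Reload the pre-compiled pipeline and generate an image; no recompilation
# is needed as long as the inputs match the compiled shapes.
pipeline = NeuronStableDiffusionPipeline.from_pretrained("sd_neuron/")
image = pipeline("a photo of a cat sitting on a bench").images[0]
image.save("cat_on_bench.png")
```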
diff --git a/notebooks/text-generation/llama2-13b-chatbot.ipynb b/notebooks/text-generation/llama2-13b-chatbot.ipynb
index 9ee785cc5..59ece3802 100644
--- a/notebooks/text-generation/llama2-13b-chatbot.ipynb
+++ b/notebooks/text-generation/llama2-13b-chatbot.ipynb
@@ -19,7 +19,7 @@
     "\n",
     "## Prerequisite: Setup AWS environment\n",
     "\n",
-    "*you can skip that paragraph if you are already running this notebook on your instance.*\n",
+    "*You can skip this section if you are already running this notebook on your instance.*\n",
     "\n",
     "In this example, we will use the *inf2.48xlarge* instance with 12 Neuron devices, corresponding to 24 Neuron Cores, and the [Hugging Face Neuron Deep Learning AMI](https://aws.amazon.com/marketplace/pp/prodview-gr3e6yiscria2).\n",
     "\n",
@@ -91,7 +91,7 @@
     "\n",
     "Depending on your choice of parameters and Inferentia host, this may take from a few minutes to more than an hour.\n",
     "\n",
-    "For your convenience, we host a pre-compiled version of that model on the Hugging Face hub, so you can skip the export and start using the model immediately in paragraph 2."
+    "For your convenience, we host a pre-compiled version of that model on the Hugging Face hub, so you can skip the export and start using the model immediately in section 2."
    ]
   },
   {
@@ -226,7 +226,7 @@
     "\n",
     "Once your model has been exported, you can generate text using the transformers library, as described in [detail in this post](https://huggingface.co/blog/how-to-generate).\n",
     "\n",
-    "If as suggested you skipped the first paragraph, don't worry: we will use a precompiled model already present on the hub instead."
+    "If, as suggested, you skipped the first section, don't worry: we will use a precompiled model already present on the hub instead."
    ]
   },
   {
@@ -418,6 +418,22 @@
    "source": [
     "print(chat(\"What is the color of my favorite fruit?\", history, max_tokens))"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "38df6da1",
+   "metadata": {},
+   "source": [
+    "**Warning**: While very powerful, large language models can sometimes *hallucinate*. We call *hallucinations* generated content that is irrelevant or made up, but presented by the model as if it were accurate. This is a flaw of LLMs and is not a side effect of using them on Trainium / Inferentia."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0f8b4dc6",
+   "metadata": {},
+   "outputs": [],
+   "source": []
   }
  ],
  "metadata": {
@@ -436,7 +452,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.9.16"
   }
  },
 "nbformat": 4,
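Both the chatbot tutorial and the notebook patched above let readers skip the potentially hour-long export by loading a pre-compiled Llama 2 13B checkpoint from the hub. A minimal sketch of that path, assuming the `NeuronModelForCausalLM` API from `optimum.neuron`; the repository id below is a placeholder for the hosted checkpoint, and the generation parameters are illustrative:

```python
from optimum.neuron import NeuronModelForCausalLM
from transformers import AutoTokenizer

# Placeholder id for the pre-compiled checkpoint the docs mention; look up
# the actual repository name on the Hugging Face hub.
repo_id = "aws-neuron/Llama-2-13b-hf-neuron"

# Loading pre-compiled artifacts skips the Neuron export/compilation step.
model = NeuronModelForCausalLM.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

inputs = tokenizer("What is the largest animal on earth?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.9,
    top_k=50,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

Note that compiled artifacts encode static shapes and core-parallelism settings, so a pre-compiled checkpoint only runs on a matching Neuron configuration.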