73 changes: 32 additions & 41 deletions philschmid/2023-04-13-bloom-sagemaker-peft.ipynb
@@ -5,21 +5,20 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Efficient Large Language Model training with LoRA and Hugging Face\n",
"# 使用 LoRA Hugging Face 进行高效的大型语言模型训练\n",
"\n",
"In this sagemaker example, we are going to learn how to apply [Low-Rank Adaptation of Large Language Models (LoRA)](https://arxiv.org/abs/2106.09685) to fine-tune BLOOMZ (7 billion parameter version instruction tuned version of BLOOM) on a single GPU. We are going to leverage Hugging Face [Transformers](https://huggingface.co/docs/transformers/index), [Accelerate](https://huggingface.co/docs/accelerate/index), and [PEFT](https://github.com/huggingface/peft). \n",
"在这个 sagemaker 示例中,我们将学习如何应用 [Low-Rank Adaptation of Large Language Models (LoRA)](https://arxiv.org/abs/2106.09685) 来微调 BLOOMZ (BLOOM 70亿参数版本指令调优版本) 在单块GPU上。 我们将利用 Hugging Face [Transformers](https://huggingface.co/docs/transformers/index), [Accelerate](https://huggingface.co/docs/accelerate/index), 以及 [PEFT](https://github.com/huggingface/peft). \n",
Reviewer comment (Contributor): This line violates grammar rules, please check.
"\n",
"You will learn how to:\n",
"你将学到如何:\n",
"\n",
"1. Setup Development Environment\n",
"2. Load and prepare the dataset\n",
"3. Fine-Tune BLOOM with LoRA and bnb int-8 on Amazon SageMaker\n",
"4. Deploy the model to Amazon SageMaker Endpoint\n",
"1. 设置开发环境\n",
"2. 加载并准备数据集\n",
"3. 在 Amazon SageMaker 上使用 LoRA bnb int-8 来微调 BLOOM\n",
"4. 将模型部署到 Amazon SageMaker 端点\n",
"\n",
"### Quick intro: PEFT or Parameter Efficient Fine-tunin\n",
"\n",
"[PEFT](https://github.com/huggingface/peft), or Parameter Efficient Fine-tuning, is a new open-source library from Hugging Face to enable efficient adaptation of pre-trained language models (PLMs) to various downstream applications without fine-tuning all the model's parameters. PEFT currently includes techniques for:\n",
"### 简介:PEFT 或参数高效微调\n",
"\n",
"[PEFT](https://github.com/huggingface/peft), 或 Parameter Efficient Fine-tuning,是 Hugging Face 的一个新的开源库,可以使预训练语言模型 (PLM) 高效适应各种下游应用程序,而无需微调模型的所有参数。 PEFT 目前包括以下技术:\n",
"- LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/pdf/2106.09685.pdf)\n",
"- Prefix Tuning: [P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks](https://arxiv.org/pdf/2110.07602.pdf)\n",
"- P-Tuning: [GPT Understands, Too](https://arxiv.org/pdf/2103.10385.pdf)\n",
@@ -40,8 +39,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"If you are going to use Sagemaker in a local environment. You need access to an IAM Role with the required permissions for Sagemaker. You can find [here](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) more about it.\n",
"\n"
"如果您要在本地环境中使用 Sagemaker。 您需要访问具有 Sagemaker 所需权限的 IAM 角色。 您可以在 [此处](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) 找到更多相关信息。"
]
},
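The setup code cell is collapsed in this diff. Below is a minimal sketch of the SageMaker session/role boilerplate this step typically performs; the fallback role name is an assumption for local use, not a value taken from the notebook.

```python
import sagemaker
import boto3

sess = sagemaker.Session()

try:
    # Works when running inside SageMaker (Studio / notebook instances)
    role = sagemaker.get_execution_role()
except ValueError:
    # Running locally: resolve the execution role explicitly (role name is an assumption)
    iam = boto3.client("iam")
    role = iam.get_role(RoleName="sagemaker_execution_role")["Role"]["Arn"]

print(f"sagemaker role arn: {role}")
print(f"sagemaker bucket: {sess.default_bucket()}")
print(f"sagemaker session region: {sess.boto_region_name}")
```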
{
@@ -78,9 +76,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Load and prepare the dataset\n",
"## 2. 加载并准备数据集\n",
"\n",
"we will use the [samsum](https://huggingface.co/datasets/samsum) dataset, a collection of about 16k messenger-like conversations with summaries. Conversations were created and written down by linguists fluent in English.\n",
"我们将使用[samsum](https://huggingface.co/datasets/samsum)数据集,这是一个包含大约16k个带有摘要的类似信使的对话的集合。对话是由精通英语的语言学家创建并记录下来的。\n",
"\n",
"```python\n",
"{\n",
@@ -90,7 +88,7 @@
"}\n",
"```\n",
"\n",
"To load the `samsum` dataset, we use the `load_dataset()` method from the 🤗 Datasets library."
"要加载 `samsum` 数据集,我们使用 🤗Hugging Face Datasets 库中的 `load_dataset()` 方法。"
]
},
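The `load_dataset()` cell is collapsed in this diff; here is a minimal sketch of what loading `samsum` typically looks like (the printed split sizes are illustrative, not output copied from the notebook).

```python
from datasets import load_dataset

# samsum ships with train/validation/test splits of dialogues plus reference summaries
dataset = load_dataset("samsum")

print(f"Train dataset size: {len(dataset['train'])}")
print(f"Test dataset size: {len(dataset['test'])}")
```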
{
@@ -113,7 +111,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To train our model, we need to convert our inputs (text) to token IDs. This is done by a 🤗 Transformers Tokenizer. If you are not sure what this means, check out **[chapter 6](https://huggingface.co/course/chapter6/1?fw=tf)** of the Hugging Face Course."
"为了训练我们的模型,我们需要将我们的输入(文本)转换为token ID。 这是由 🤗 Transformers Tokenizer 完成的。 如果您不确定这是什么意思,请查看抱抱脸课程的**[第 6 章](https://huggingface.co/course/chapter6/1?fw=tf)**"
]
},
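The tokenizer cell is collapsed as well. A minimal sketch, assuming the `bigscience/bloomz-7b1` checkpoint — the exact model ID used by the notebook is not visible in this diff:

```python
from transformers import AutoTokenizer

model_id = "bigscience/bloomz-7b1"  # assumed checkpoint; use the model you fine-tune
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quick sanity check: text in, token IDs out
print(tokenizer("Hello from SageMaker!")["input_ids"])
```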
{
@@ -136,10 +134,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Before we can start training, we need to preprocess our data. Abstractive Summarization is a text-generation task. Our model will take a text as input and generate a summary as output. We want to understand how long our input and output will take to batch our data efficiently.\n",
"在开始训练之前,我们需要预处理我们的数据。抽象摘要是一项文本生成任务。我们的模型将以文本作为输入,并生成摘要作为输出。我们想了解输入和输出需要多长时间才能有效地批处理数据。\n",
"\n",
"We defined a `prompt_template` which we will use to construct an instruct prompt for better performance of our model. Our `prompt_template` has a “fixed” start and end, and our document is in the middle. This means we need to ensure that the “fixed” template parts + document are not exceeding the max length of the model. \n",
"We preprocess our dataset before training and save it to disk to then upload it to S3. You could run this step on your local machine or a CPU and upload it to the [Hugging Face Hub](https://huggingface.co/docs/hub/datasets-overview)."
"我们定义了一个' prompt_template ',我们将使用它来构造一个指示提示符,以提高模型的性能。我们的' prompt_template '有一个固定的开始和结束,我们的文档在中间。这意味着我们需要确保“固定”模板部件+文档不超过模型的最大长度。\n",
Reviewer comment (Contributor): Please reconsider translation of instruct prompt.
"我们在训练前对数据集进行预处理,并将其保存到磁盘上,然后上传到S3。你可以在你的本地机器或CPU上运行这个步骤,并将它上传到[Hugging Face Hub](https://huggingface.co/docs/hub/datasets-overview)"
]
},
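The preprocessing cells are collapsed in this diff. The sketch below illustrates the templating-and-tokenization idea described above; the template wording, helper names, and output path are assumptions rather than the notebook's exact code.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-7b1")  # assumed checkpoint
dataset = load_dataset("samsum")

# Illustrative instruct prompt: a "fixed" start and end with the dialogue in the middle
prompt_template = "Summarize the chat dialogue:\n{dialogue}\n---\nSummary:\n{summary}{eos}"

def apply_template(sample):
    sample["text"] = prompt_template.format(
        dialogue=sample["dialogue"],
        summary=sample["summary"],
        eos=tokenizer.eos_token,
    )
    return sample

# Build the prompt text, then tokenize it
templated = dataset["train"].map(apply_template, remove_columns=list(dataset["train"].features))
tokenized = templated.map(lambda s: tokenizer(s["text"]), batched=True, remove_columns=["text"])

# Persist to disk so the processed split can be uploaded to S3 in the next step
tokenized.save_to_disk("data/train")
```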
{
@@ -214,7 +212,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"After we processed the datasets we are going to use the new [FileSystem integration](https://huggingface.co/docs/datasets/filesystems) to upload our dataset to S3. We are using the `sess.default_bucket()`, adjust this if you want to store the dataset in a different S3 bucket. We will use the S3 path later in our training script."
"处理完数据集后,我们将使用新的[文件系统整合](https://huggingface.co/docs/datasets/filesystems)将我们的数据集上传到S3。我们正在使用' sess.default_bucket() ',如果您想将数据集存储在不同的S3桶中,请调整此设置。我们将在后面的训练脚本中使用S3路径。"
]
},
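The actual upload cell is collapsed; a minimal sketch, assuming `s3fs` is installed so 🤗 Datasets can write directly to an `s3://` URI (the bucket prefix below is made up):

```python
import sagemaker
from datasets import load_from_disk

sess = sagemaker.Session()

# Target S3 URI inside the session's default bucket (prefix is illustrative)
training_input_path = f"s3://{sess.default_bucket()}/processed/samsum/train"

# With s3fs available, save_to_disk can write straight to S3
train_dataset = load_from_disk("data/train")
train_dataset.save_to_disk(training_input_path)

print(f"Uploaded training dataset to: {training_input_path}")
```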
{
@@ -236,15 +234,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Fine-Tune BLOOM with LoRA and bnb int-8 on Amazon SageMaker\n",
"## 3. 在 Amazon SageMaker 上使用 LoRA bnb int-8 微调 BLOOM\n",
"\n",
"In addition to the LoRA technique, we will use [bitsanbytes LLM.int8()](https://huggingface.co/blog/hf-bitsandbytes-integration) to quantize out frozen LLM to int8. This allows us to reduce the needed memory for BLOOMZ ~4x. \n",
"除了 LoRA 技术,我们还将使用 [bitsanbytes LLM.int8()](https://huggingface.co/blog/hf-bitsandbytes-integration) 通过Freeze方法将 LLM 量化为 int8。 这使我们能够将 BLOOMZ 所需的内存减少约 4 倍。\n",
"\n",
"We prepared a [run_clm.py](./scripts/run_clm.py), which implements uses PEFT to train our model. If you are interested in how this works check-out [Efficient Large Language Model training with LoRA and Hugging Face](https://www.philschmid.de/fine-tune-flan-t5-peft) blog, where we explain the training script in detail. T\n",
"我们准备了一个 [run_clm.py](./scripts/run_clm.py),它实现了使用 PEFT 来训练我们的模型。 如果您对其工作原理感兴趣,请查看 [使用 LoRA Hugging Face 进行高效大型语言模型训练](https://www.philschmid.de/fine-tune-flan-t5-peft) 博客,我们在其中详细的解释了训练脚本。\n",
"\n",
"\n",
"In order to create a sagemaker training job we need an `HuggingFace` Estimator. The Estimator handles end-to-end Amazon SageMaker training and deployment tasks. The Estimator manages the infrastructure use. \n",
"SagMaker takes care of starting and managing all the required ec2 instances for us, provides the correct huggingface container, uploads the provided scripts and downloads the data from our S3 bucket into the container at `/opt/ml/input/data`. Then, it starts the training job by running.\n",
"为了创建 sagemaker 训练作业,我们需要一个 HuggingFace 估计器。 Estimator 处理端到端的 Amazon SageMaker 训练和部署任务。 Estimator 管理基础设施的使用。\n",
Reviewer comment (Contributor): For estimator object, I think it should be okay to leave estimator untranslated.
"SagMaker 负责为我们启动和管理所有必需的 ec2 实例,提供正确的 huggingface 容器,上传提供的脚本并将数据从我们的 S3 存储桶下载到位于“/opt/ml/input/data”的容器中。 然后,它通过运行开始训练工作。\n",
"\n"
]
},
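The estimator cell is collapsed in this diff. The sketch below shows what a `HuggingFace` Estimator for this job could look like; the hyperparameter names, container versions, and instance type are assumptions based on the surrounding text rather than the notebook's exact values.

```python
import time
import sagemaker
from sagemaker.huggingface import HuggingFace

sess = sagemaker.Session()
role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions (see step 1)

# Hyperparameters forwarded to scripts/run_clm.py (names are illustrative)
hyperparameters = {
    "model_id": "bigscience/bloomz-7b1",
    "dataset_path": "/opt/ml/input/data/training",
    "epochs": 3,
    "per_device_train_batch_size": 1,
    "lr": 2e-4,
}

huggingface_estimator = HuggingFace(
    entry_point="run_clm.py",                # training script from ./scripts
    source_dir="./scripts",
    instance_type="ml.g5.2xlarge",           # single-GPU instance referenced in the cost note below
    instance_count=1,
    base_job_name=f"bloomz-peft-{time.strftime('%Y-%m-%d-%H-%M-%S')}",
    role=role,
    transformers_version="4.26",             # assumed Hugging Face DLC versions
    pytorch_version="1.13",
    py_version="py39",
    hyperparameters=hyperparameters,
)
```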
@@ -290,7 +288,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now start our training job, with the `.fit()` method passing our S3 path to the training script."
"我们现在可以开始我们的训练工作,使用 .fit() 方法将我们的 S3 路径传递给训练脚本。"
]
},
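A minimal sketch of the `.fit()` call, reusing the `training_input_path` and `huggingface_estimator` objects from the sketches above:

```python
# Map the processed S3 dataset to the "training" channel and start the job
data = {"training": training_input_path}
huggingface_estimator.fit(data, wait=True)
```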
{
@@ -311,19 +309,19 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The trainign took `20632` seconds, which is about `5.7` hours. The `ml.g5.2xlarge` instance we used costs `$1.515` per hour. So the total cost for training BLOOMZ 7B was is `$8.63`. We could reduce the cost by using a spot instance, but the training time could increase, by waiting or restarts."
"训练用了“20632”秒,大约是“5.7”小时。 我们使用的“ml.g5.2xlarge”实例每小时收费“1.515 美元”。 因此,训练 BLOOMZ 7B 的总成本为 8.63 美元。 我们可以通过使用 spot 实例来降低成本,但是通过等待或重启可能会增加训练时间。"
Reviewer comment (Contributor): Please reconsider wording of by in this sentence. It feels more like giving reason or explanation.
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Deploy the model to Amazon SageMaker Endpoint\n",
"## 4. 将模型部署到 Amazon SageMaker 端点\n",
"\n",
"When using `peft` for training, you normally end up with adapter weights. We added the `merge_and_unload()` method to merge the base model with the adatper to make it easier to deploy the model. Since we can now use the `pipelines` feature of the `transformers` library. \n",
"当使用 `peft` 进行训练时,您通常会得到适配器权重。 我们添加了 `merge_and_unload()` 方法来将基础模型与 adatper 合并,以便更轻松地部署模型。 因此我们现在可以使用 transformers 库的 pipelines 功能。\n",
"\n",
"We can now deploy our model using the `deploy()` on our HuggingFace estimator object, passing in our desired number of instances and instance type.\n"
"我们现在可以在我们的 HuggingFace 估计器对象上使用“deploy()”来部署我们的模型,传递我们想要的实例数量和实例类型。\n"
]
},
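The deployment cell is collapsed; a minimal sketch of calling `deploy()` on the estimator, where the instance type and count are assumptions:

```python
# Deploy the fine-tuned (merged) model behind a real-time SageMaker endpoint
predictor = huggingface_estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.4xlarge",  # assumed GPU inference instance
)
```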
{
@@ -356,9 +354,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"SageMaker starts the deployment process by creating a SageMaker Endpoint Configuration and a SageMaker Endpoint. The Endpoint Configuration defines the model and the instance type.\n",
"SageMaker 通过创建 SageMaker 端点配置和 SageMaker 端点来启动部署过程。 端点配置定义模型和实例类型。\n",
"\n",
"Lets test by using a example from the `test` split."
"让我们使用“测试”拆分中的示例进行测试。"
Reviewer comment (Contributor): Please reconsider wording of split.
]
},
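The inference cell is collapsed in this diff. The sketch below shows one way to build a request from a random `test`-split sample; the prompt shape mirrors the training template sketched earlier and is therefore an assumption.

```python
from random import randint
from datasets import load_dataset

test_dataset = load_dataset("samsum", split="test")
sample = test_dataset[randint(0, len(test_dataset) - 1)]

# Same instruct-prompt shape as in training, but without the reference summary
prompt = f"Summarize the chat dialogue:\n{sample['dialogue']}\n---\nSummary:\n"

# The endpoint wraps a transformers pipeline, so it accepts the usual inputs/parameters payload
response = predictor.predict({
    "inputs": prompt,
    "parameters": {"max_new_tokens": 100, "do_sample": False},
})
print(response)
```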
{
@@ -401,7 +399,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Lets compare it to the test result"
"让我们将其与测试结果进行比较"
]
},
{
@@ -419,7 +417,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we delete the endpoint again."
"最后,我们再次删除端点。"
]
},
{
@@ -432,13 +430,6 @@
"predictor.delete_model()\n",
"predictor.delete_endpoint()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {