
Commit 0a0a78c

update prophetnet examples (PaddlePaddle#5514)
1 parent 5931590 commit 0a0a78c

5 files changed: +233 −193 lines changed

examples/text_summarization/prophetnet/README.md

+41 −21
@@ -5,14 +5,14 @@
 ProphetNet (Prophet Network) is a new type of seq2seq pre-trained model. At each time step during training, ProphetNet learns to predict the next N future tokens simultaneously; this self-supervised objective lets the model take more distant future tokens into account and keeps it from overfitting to strong
 local correlation.
 
-This project is an open-source text summarization example of ProphetNet on PaddlePaddle 2.2, with code for fine-tuning and generation on the CNN/DailyMail and Gigaword datasets.
+This project is an open-source text summarization example of ProphetNet on PaddlePaddle 2.4, with code for fine-tuning and generation on the CNN/DailyMail and Gigaword datasets.
 
 ### Project dependencies
 
 ```
 pip install -r requirements.txt
-python -m pip install paddlepaddle-gpu==2.2.2.post112 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
-pip install paddlenlp==2.2.3
+python -m pip install paddlepaddle-gpu==2.4.1.post117 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html
+pip install paddlenlp==2.5.2
 ```
 
 ### Code structure
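A quick way to confirm the upgraded dependencies took effect is sketched below. This is an editor's illustration rather than part of the README or the diff; it relies only on `paddle.utils.run_check()` and the packages' `__version__` attributes.

```
# Sanity-check the upgraded install (illustrative, not from the README):
python -c "import paddle; paddle.utils.run_check(); print(paddle.__version__)"   # expect 2.4.1
python -c "import paddlenlp; print(paddlenlp.__version__)"                        # expect 2.5.2
```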
@@ -64,54 +64,74 @@ bash run_train.sh <DATASET>
 - cnndm:
 
 ```
-python train_prophetnet.py \
+python -m paddle.distributed.launch --gpus 0 train_prophetnet.py \
     --dataset=cnndm \
     --model_name_or_path=prophetnet-large-uncased \
-    --batch_size=4 \
-    --epochs=4 \
-    --lr=0.0001 \
+    --per_device_train_batch_size=4 \
+    --per_device_eval_batch_size=8 \
+    --num_train_epochs=4 \
+    --learning_rate=0.0001 \
     --warmup_init_lr=1e-07 \
     --warmup_steps=1000 \
-    --clip_norm=0.1 \
-    --num_workers=4 \
+    --max_grad_norm=0.1 \
+    --dataloader_num_workers=4 \
+    --logging_steps 10 \
+    --save_steps 100 \
+    --do_train \
+    --do_eval \
     --output_dir=./ckpt/cnndm
 ```
 
 - gigaword:
 
 ```
-python train_prophetnet.py \
+python -m paddle.distributed.launch --gpus 0 train_prophetnet.py \
     --dataset=gigaword \
     --model_name_or_path=prophetnet-large-uncased \
-    --batch_size=16 \
-    --epochs=6 \
-    --lr=0.0001 \
+    --per_device_train_batch_size=16 \
+    --per_device_eval_batch_size=32 \
+    --num_train_epochs=6 \
+    --learning_rate=0.0001 \
     --warmup_init_lr=1e-07 \
     --warmup_steps=1000 \
-    --clip_norm=0.1 \
-    --num_workers=8 \
+    --max_grad_norm=0.1 \
+    --dataloader_num_workers=8 \
+    --logging_steps 10 \
+    --save_steps 100 \
+    --do_train \
+    --do_eval \
     --output_dir=./ckpt/gigaword
 ```
 
 The parameters are described as follows:
 
 - `dataset`: specifies the dataset, either cnndm or gigaword
 
-- `model_name_or_path`: the name of a pretrained model, or the path to local pretrained weights used for initialization.
+- `model_name_or_path`: the name of a pretrained model, or the path to local pretrained weights used for initialization
+
+- `per_device_train_batch_size`: the training batch size per card
 
-- `batch_size`: the training batch size.
+- `per_device_eval_batch_size`: the evaluation batch size per card
 
-- `epochs`: the number of training epochs
+- `num_train_epochs`: the number of training epochs
 
-- `lr`: the learning rate
+- `learning_rate`: the learning rate
 
 - `warmup_init_lr`: the initial warmup learning rate
 
 - `warmup_steps`: the number of warmup steps
 
-- `clip_norm`: the gradient clipping threshold
+- `max_grad_norm`: the gradient clipping threshold
+
+- `dataloader_num_workers`: the number of data-loading workers
+
+- `logging_steps`: the interval, in steps, between logged results
+
+- `save_steps`: the interval, in steps, between evaluations
+
+- `do_train`: whether to run training
 
-- `num_workers`: the number of data-loading workers
+- `do_eval`: whether to run evaluation
 
 - `output_dir`: the directory where the fine-tuned weights are stored
 
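As a usage sketch of the updated interface (the multi-card note below is an illustrative assumption, not part of the example's documented setup):

```
# Fine-tune via the wrapper script updated in this commit (GPU id 0, as in the commands above):
bash run_train.sh cnndm      # writes checkpoints and eval results to ./ckpt/cnndm
bash run_train.sh gigaword   # writes checkpoints and eval results to ./ckpt/gigaword

# Illustrative multi-card variant: paddle.distributed.launch accepts a comma-separated
# GPU list, and the per_device_* batch sizes then apply to each card, e.g.
#   python -m paddle.distributed.launch --gpus "0,1" train_prophetnet.py --dataset=cnndm ...
```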
examples/text_summarization/prophetnet/run_train.sh
@@ -1,29 +1,54 @@
 #!/bin/bash
+
+# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
 DATASET=$1
 
 if [ "$DATASET" == cnndm ]
 then
-    python train_prophetnet.py \
+    python -m paddle.distributed.launch --gpus 0 train_prophetnet.py \
         --dataset=cnndm \
         --model_name_or_path=prophetnet-large-uncased \
-        --batch_size=4 \
-        --epochs=4 \
-        --lr=0.0001 \
+        --per_device_train_batch_size=4 \
+        --per_device_eval_batch_size=8 \
+        --num_train_epochs=4 \
+        --learning_rate=0.0001 \
         --warmup_init_lr=1e-07 \
         --warmup_steps=1000 \
-        --clip_norm=0.1 \
-        --num_workers=4 \
+        --max_grad_norm=0.1 \
+        --dataloader_num_workers=4 \
+        --logging_steps 10 \
+        --save_steps 100 \
+        --do_train \
+        --do_eval \
         --output_dir=./ckpt/cnndm
 else
-    python train_prophetnet.py \
+    python -m paddle.distributed.launch --gpus 0 train_prophetnet.py \
         --dataset=gigaword \
         --model_name_or_path=prophetnet-large-uncased \
-        --batch_size=16 \
-        --epochs=6 \
-        --lr=0.0001 \
+        --per_device_train_batch_size=16 \
+        --per_device_eval_batch_size=32 \
+        --num_train_epochs=6 \
+        --learning_rate=0.0001 \
         --warmup_init_lr=1e-07 \
         --warmup_steps=1000 \
-        --clip_norm=0.1 \
-        --num_workers=8 \
+        --max_grad_norm=0.1 \
+        --dataloader_num_workers=8 \
+        --logging_steps 10 \
+        --save_steps 100 \
+        --do_train \
+        --do_eval \
         --output_dir=./ckpt/gigaword
 fi
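One note on the script's control flow: any argument other than cnndm falls through to the gigaword branch. A hypothetical guard (not part of this commit) could reject unknown dataset names up front:

```
# Hypothetical guard, not in the commit: fail fast on unknown dataset names
# before reaching the if/else above.
if [ "$DATASET" != cnndm ] && [ "$DATASET" != gigaword ]
then
    echo "Usage: bash run_train.sh {cnndm|gigaword}" >&2
    exit 1
fi
```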
