training.log
2025-01-23 19:41:47,171 - INFO - Using device: cuda
Config: {'n_layer': 4, 'n_head': 4, 'd_model': 256, 'd_ff': 1024, 'max_seq_len': 128, 'batch_size': 8, 'lr': 0.0002, 'weight_decay': 0.1, 'epochs': 150, 'warmup_steps': 100, 'grad_clip': 1.0, 'dataset_name': 'wikitext', 'dataset_config': 'wikitext-2-raw-v1', 'subset_size': 500, 'temperature': 0.7, 'top_k': 50}
2025-01-23 19:41:51,866 - INFO - Model parameters: 28,974,161
2025-01-23 19:42:18,565 - INFO - Epoch 10/150 completed
2025-01-23 19:42:43,540 - INFO - Epoch 20/150 completed
2025-01-23 19:43:10,826 - INFO - Epoch 30/150 completed
2025-01-23 19:43:36,064 - INFO - Epoch 40/150 completed
2025-01-23 19:44:01,270 - INFO - Epoch 50/150 completed
2025-01-23 19:44:26,808 - INFO - Epoch 60/150 completed
2025-01-23 19:44:52,371 - INFO - Epoch 70/150 completed
2025-01-23 19:45:17,822 - INFO - Epoch 80/150 completed
2025-01-23 19:45:43,304 - INFO - Epoch 90/150 completed
2025-01-23 19:46:08,854 - INFO - Epoch 100/150 completed
2025-01-23 19:46:34,507 - INFO - Epoch 110/150 completed
2025-01-23 19:47:00,169 - INFO - Epoch 120/150 completed
2025-01-23 19:47:25,793 - INFO - Epoch 130/150 completed
2025-01-23 19:47:51,382 - INFO - Epoch 140/150 completed
2025-01-23 19:48:17,770 - INFO - Epoch 150/150 completed
2025-01-23 19:48:18,691 - INFO - Training loss plot saved to training_loss.png
2025-01-23 19:48:19,179 - INFO - Training completed!