forked from microsoft/mattersim
docs: add finetune docs (microsoft#78)
Co-authored-by: Xixian <[email protected]>
1 parent bebbbd0, commit ee63939
Showing 4 changed files with 2,040 additions and 0 deletions.
@@ -0,0 +1,86 @@

Finetune MatterSim
==================

Finetune Script
---------------

MatterSim provides a script for finetuning the pre-trained MatterSim model on a custom dataset.
You can find the script in the ``training`` folder or on
`GitHub <https://github.com/microsoft/mattersim/blob/main/src/mattersim/training/finetune_mattersim.py>`_.

Finetune Parameters
-------------------

The finetune script accepts the following command-line arguments to customize the training process:

- **run_name**: (str) The name of the run. Default is "example".

- **train_data_path**: (str) Path to the training data file. Supports file types readable by ASE (e.g., ``.xyz``, ``.traj``, ``.cif``) as well as ``.pkl`` files. Default is "./sample.xyz".

- **valid_data_path**: (str) Path to the validation data file. Default is None.

- **load_model_path**: (str) Path of the pre-trained model to load. Default is "mattersim-v1.0.0-1m".

- **save_path**: (str) Path where the trained model is saved. Default is "./results".

- **save_checkpoint**: (bool) Whether to save checkpoints during training. Default is False.

- **ckpt_interval**: (int) Interval (in epochs) between saved checkpoints. Default is 10.

- **device**: (str) Device to use for training, either "cuda" or "cpu". Default is "cuda".

- **cutoff**: (float) Cutoff radius for two-body interactions. Default is 5.0.

- **threebody_cutoff**: (float) Cutoff radius for three-body interactions. It should be smaller than the two-body ``cutoff``. Default is 4.0.

- **epochs**: (int) Number of training epochs. Default is 1000.

- **batch_size**: (int) Batch size for training. Default is 16.

- **lr**: (float) Learning rate for the optimizer. Default is 2e-4.

- **step_size**: (int) Step size for the learning rate scheduler. Default is 10.

- **include_forces**: (bool) Whether to include forces in the training. Default is True.

- **include_stresses**: (bool) Whether to include stresses in the training. Default is False.

- **force_loss_ratio**: (float) Weight of the force loss in the total loss. Default is 1.0.

- **stress_loss_ratio**: (float) Weight of the stress loss in the total loss. Default is 0.1.

- **early_stop_patience**: (int) Patience for early stopping. Default is 10.

- **seed**: (int) Random seed for reproducibility. Default is 42.

- **re_normalize**: (bool) Whether to re-normalize energy and forces according to the new data. Default is False.

- **scale_key**: (str) Key for scaling forces. Only used when ``re_normalize`` is True. Default is "per_species_forces_rms".

- **shift_key**: (str) Key for shifting energy. Only used when ``re_normalize`` is True. Default is "per_species_energy_mean_linear_reg".

- **init_scale**: (float) Initial scale value. Only used when ``re_normalize`` is True. Default is None.

- **init_shift**: (float) Initial shift value. Only used when ``re_normalize`` is True. Default is None.

- **trainable_scale**: (bool) Whether the scale is trainable. Only used when ``re_normalize`` is True. Default is False.

- **trainable_shift**: (bool) Whether the shift is trainable. Only used when ``re_normalize`` is True. Default is False.

- **wandb**: (bool) Whether to use Weights & Biases for logging. Default is False.

- **wandb_api_key**: (str) API key for Weights & Biases. Default is None.

- **wandb_project**: (str) Project name for Weights & Biases. Default is "wandb_test".

These parameters allow you to customize the finetuning process to suit your specific dataset and computational resources.

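The loss-related options (``include_forces``, ``include_stresses``, ``force_loss_ratio``, ``stress_loss_ratio``) can be read as switches and weights on separate loss terms. The sketch below assumes the total loss is the energy loss plus the weighted force and stress losses. Both this weighting and the function name ``total_loss`` are illustrative assumptions, not taken from the MatterSim source.

```python
def total_loss(
    energy_loss,
    force_loss=0.0,
    stress_loss=0.0,
    *,
    include_forces=True,
    include_stresses=False,
    force_loss_ratio=1.0,
    stress_loss_ratio=0.1,
):
    """Combine per-term losses into one scalar.

    Assumption: loss = energy + force_loss_ratio * force
                       + stress_loss_ratio * stress,
    where a term is dropped when its include_* flag is False.
    Keyword defaults mirror the documented parameter defaults.
    """
    loss = energy_loss
    if include_forces:
        loss += force_loss_ratio * force_loss
    if include_stresses:
        loss += stress_loss_ratio * stress_loss
    return loss


# With the documented defaults, the stress term does not contribute.
print(total_loss(0.02, 0.10, 0.50))

# Enabling stresses adds stress_loss_ratio * stress_loss to the total.
print(total_loss(0.02, 0.10, 0.50, include_stresses=True))
```

Under this reading, raising ``force_loss_ratio`` makes force errors count more heavily relative to energy errors, and ``stress_loss_ratio`` has no effect unless ``include_stresses`` is enabled.
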
Finetune Example
----------------

The following command finetunes the pre-trained model. Replace the data paths with the paths to your own data.

.. code-block:: bash

    torchrun --nproc_per_node=1 src/mattersim/training/finetune_mattersim.py \
        --load_model_path mattersim-v1.0.0-1m \
        --train_data_path xyz_files/train.xyz \
        --valid_data_path xyz_files/valid.xyz \
        --batch_size 16 --lr 2e-4 --step_size 20 --epochs 200 \
        --save_path ./finetune_result \
        --save_checkpoint --ckpt_interval 20 \
        --include_stresses --include_forces
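
When running several finetuning jobs, it can help to assemble the command from a dictionary of parameters rather than editing a long shell line. The sketch below rebuilds the example command above using only the Python standard library. The helper ``build_command`` is hypothetical, and it assumes boolean options are passed as bare flags, as in the example command; check the script's argument parser before relying on that.

```python
import shlex

# Script path as used in the example command.
SCRIPT = "src/mattersim/training/finetune_mattersim.py"


def build_command(params, nproc_per_node=1):
    """Return a torchrun argument list for the finetune script."""
    cmd = ["torchrun", f"--nproc_per_node={nproc_per_node}", SCRIPT]
    for name, value in params.items():
        if isinstance(value, bool):
            # Assumption: booleans are bare flags; False means "omit the flag".
            if value:
                cmd.append(f"--{name}")
        elif value is not None:
            # None means "use the script's default", so the option is skipped.
            cmd.extend([f"--{name}", str(value)])
    return cmd


params = {
    "load_model_path": "mattersim-v1.0.0-1m",
    "train_data_path": "xyz_files/train.xyz",
    "valid_data_path": "xyz_files/valid.xyz",
    "batch_size": 16,
    "lr": 2e-4,
    "step_size": 20,
    "epochs": 200,
    "save_path": "./finetune_result",
    "save_checkpoint": True,
    "ckpt_interval": 20,
    "include_stresses": True,
    "include_forces": True,
}

# Print a shell-quoted command line for inspection before launching it.
print(shlex.join(build_command(params)))
```

The returned list can be passed to ``subprocess.run(cmd, check=True)`` to launch the job. Because ``False`` values are simply omitted, this helper cannot switch off an option whose default is True.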