hippynn.pretraining doesn't have hierarchical_energy_initialization and problem with species data type. #124
Hi, thanks for getting in touch! For the first problem with hierarchical_energy_initialization, install the library from source: you can use the existing conda environment, download the repository, and then install it from the repository directory. For the second problem, you want to set the database inputs and targets using the db_info output of assemble_for_training. Finally, as a small remark, depending on your programming perspective, it may also help to read the ani1x_training.py example, which organizes a training code into separate pieces and demonstrates how these pieces interact, as well as a wider set of options that can be explored. Please let us know if these comments resolve your problems, thanks!
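A rough sketch of what those two steps could look like in practice (the loss nodes, the arrays dictionary, and the split settings below are placeholders, not taken from this thread):

# Install from source inside the existing conda environment (shell steps shown as comments):
#   git clone https://github.com/lanl/hippynn.git
#   cd hippynn && pip install .
# Then let assemble_for_training supply the database inputs and targets through db_info.
import hippynn
training_modules, db_info = hippynn.experiment.assemble_for_training(mse_energy, validation_losses)
database = hippynn.databases.Database(
    arrays,            # placeholder: dict of numpy arrays keyed by each node's db_name
    seed=0,            # placeholder seed for the train/valid/test split
    test_size=0.1,
    valid_size=0.1,
    **db_info,         # supplies the 'inputs' and 'targets' lists matching the assembled graph
)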
Thanks for the help! I managed to run the training by reorganizing the inputs. I just have a follow-up question. The training data I have is in a periodic domain. I have 2300 configurations, and each configuration has a different cell size. This means I have a cell array of shape (2300, 3, 3), which does not match the shape given in the documentation.
Ah, you found a typo in our documentation! You have the right shape already; the documentation should say (n_sys, 3, 3), and we will fix this. Then you can instantiate a model like in this example. By the way, if your system does not have a symmetric cell matrix, please pay attention to the convention in that same paragraph describing which version of the cell we use. If you get crazy results, it could be that the cell has the wrong transposition.
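A minimal sketch of supplying a per-configuration cell, assuming db_name values of "cell", "R", and "Z" and node usage following the periodic examples in the repository (treat the specifics as assumptions rather than an exact recipe):

# Sketch only: the "cell" db_name and the periodic Hipnn construction below are assumptions.
import numpy as np
from hippynn.graphs import inputs, networks
arrays["cell"] = cells.astype(np.float64)   # cells: your existing array of shape (2300, 3, 3)
species = inputs.SpeciesNode(db_name="Z")
positions = inputs.PositionsNode(db_name="R")
cell = inputs.CellNode(db_name="cell")
# Pass the cell node as a parent and enable periodic pair finding.
network = networks.Hipnn("hipnn_model", (species, positions, cell), periodic=True, module_kwargs=network_params)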
It looks like you are using hyperparameters from barebones.py, where the model is fit to the QM7 dataset, which is given in Bohr rather than Angstroms. So you are specifying a minimum distance sensitivity of 1.7, which is too large for data in Angstroms. Hippynn is unit-transparent, so any dimensionful hyperparameters operate with respect to the units in your dataset. A baseline general set of sensitivity hyperparameters in Angstroms is here. If you are looking to model more near-equilibrium phenomena, you can maybe get away with a larger lower cutoff, like 0.85 Angstrom. Neural net potentials will typically go crazy outside of their training data, for example if some high-energy system is constructed and atoms come very close, like 0.5 A away from each other; most datasets do not cover this kind of regime. HIP-NN neural networks are designed not to pass messages below some threshold distance, and the package will instead warn you that your system contains atoms that are too close for the NN to be expected to produce reasonable answers.
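For illustration, an Angstrom-scale set of network hyperparameters might look something like this (the keyword names follow hippynn's network_params convention, but the numeric values are illustrative assumptions, not the linked baseline):

# Illustrative Hipnn hyperparameters with distance sensitivities in Angstroms.
# Numeric values are assumptions for a water-like dataset, not an official baseline.
network_params = {
    "possible_species": [0, 1, 8],   # blank, hydrogen, oxygen
    "n_features": 20,
    "n_sensitivities": 20,
    "dist_soft_min": 0.85,           # lower sensitivity cutoff, in Angstroms
    "dist_soft_max": 5.0,            # soft upper cutoff, in Angstroms
    "dist_hard_max": 7.5,            # hard cutoff, in Angstroms
    "n_interaction_layers": 2,
    "n_atom_layers": 3,
}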
Thanks for all the help! |
You're welcome! |
Hi,

I am having two issues while trying to train hippynn. I used the following command to install hippynn in my conda environment:
conda install -c conda-forge hippynn
However, when I try to run
from hippynn.pretraining import hierarchical_energy_initialization
I get an import error. I checked the pretraining.py file and there is no hierarchical_energy_initialization in it. I haven't installed the dependencies from conda_environment.txt.
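For reference, this is how barebones.py (recent versions) uses the function I am trying to import; the henergy and database names are assumptions matching that example:

from hippynn.pretraining import hierarchical_energy_initialization
# henergy: the HEnergyNode from the graph; database: the hippynn Database object (assumed names).
hierarchical_energy_initialization(henergy, database, trainable_after=False)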
My next issue is with training hippynn using numpy arrays as the dataset. I have DFT data of water molecules with positions, species, and energy. I tried converting the numpy arrays into a database using hippynn.databases.Database, but when I tried training the model, I am getting the following error. My species data (db_name = Z) is of int32 data type, as shown here. Here is the code for creating the database:
import numpy as np
# read_coords and get_energy are my own helper functions for reading the DFT data.
r = read_coords(data_path)
E = get_energy(data_path)
# Atomic numbers: ones (hydrogen) everywhere, with the first 8 columns set to 8 (oxygen).
Natoms = 24
atom_type = np.ones((E.shape[0], Natoms), dtype=np.int32)
atom_type[:, :8] = 8
I have a total of 24 atoms (8 water molecules) and 2300 configurations. r is the position array of shape (2300 x 72), E is the energy array of shape (2300,), and atom_type is the atomic number array of shape (2300 x 24).
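A minimal sketch of how these arrays could then be packed into the dictionary passed to hippynn.databases.Database, assuming the barebones.py db_name keys ("R", "Z", "T") and that positions are reshaped to (n_configs, n_atoms, 3):

# Sketch only: the db_name keys follow barebones.py; the (2300, 24, 3) reshape is an assumed layout.
arrays = {
    "R": r.reshape(-1, Natoms, 3).astype(np.float64),  # positions per configuration and atom
    "Z": atom_type,                                     # atomic numbers, int32, shape (2300, 24)
    "T": E.astype(np.float64),                          # total energies, shape (2300,)
}

This dictionary would then be passed to the Database constructor along with **db_info from assemble_for_training.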
After this I followed the steps from barebones.py and used the following lines to create nodes and losses:
training_modules, db_info = hippynn.experiment.assemble_for_training(mse_energy, validation_losses)
After this I used 100 configurations to create the database.
From here I used the training code from barebones.py and ran into the data type issue.
Please let me know what I can do to resolve these issues.