Active-Learning Assisted Framework for Efficient Parameterization of Force Fields

This repository contains code and documentation for the paper:

"Active-Learning Assisted General Framework for Efficient Parameterization of Force Fields", submitted to the Journal of Chemical Theory and Computation.

Authors: Yati¹, Yash Kokane², and Anirban Mondal¹,*
¹Department of Chemistry, Indian Institute of Technology Gandhinagar, Gujarat, 382355, India
²Department of Materials Engineering, Indian Institute of Technology Gandhinagar, Gujarat, 382355, India
Email: [email protected]

Introduction

This repository provides the implementation of an efficient approach to optimizing Lennard-Jones (LJ) force field parameters for sulfone molecules. The framework combines Genetic Algorithms (GA) with Gaussian Process Regression (GPR) to significantly reduce the computational expense associated with traditional force field parameterization methods.

The code is designed to:

Start with initial LJ parameter guesses near the OPLS (Optimized Potentials for Liquid Simulations) values.
Use GROMACS molecular dynamics simulations to evaluate the fitness of these parameters by comparing simulated densities and radial distribution functions (RDFs) to reference data obtained from Ab Initio Molecular Dynamics (AIMD) simulations.
Train a GPR model on the evaluated parameters and their fitness values.
Use a GA to predict new parameter sets that minimize the fitness function.
Iterate this process to efficiently converge on optimized force field parameters.
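The fitness evaluation described above can be pictured as a single scalar that penalizes deviations in the density and in each RDF. The following is a minimal sketch under stated assumptions, not the function used in main.py: the names `rdf_rmse` and `fitness`, the combination of a relative density error with a mean RDF root-mean-square deviation, the default weights, and the toy numbers are all hypothetical and chosen only for illustration.

```python
import math

def rdf_rmse(simulated, reference):
    """Root-mean-square deviation between two RDFs sampled on the same r grid."""
    if len(simulated) != len(reference):
        raise ValueError("RDFs must be sampled on the same grid")
    squared = [(s - r) ** 2 for s, r in zip(simulated, reference)]
    return math.sqrt(sum(squared) / len(squared))

def fitness(sim_density, ref_density, sim_rdfs, ref_rdfs, w_density=1.0, w_rdf=1.0):
    """Hypothetical fitness: relative density error plus mean RDF RMSE (lower is better)."""
    density_error = abs(sim_density - ref_density) / ref_density
    rdf_error = sum(rdf_rmse(sim_rdfs[k], ref_rdfs[k]) for k in ref_rdfs) / len(ref_rdfs)
    return w_density * density_error + w_rdf * rdf_error

# Toy numbers for illustration only (not real simulation or AIMD data).
reference_rdfs = {"OO": [0.0, 0.5, 1.2, 1.0], "SS": [0.0, 0.2, 1.5, 1.0]}
simulated_rdfs = {"OO": [0.0, 0.6, 1.1, 1.0], "SS": [0.0, 0.3, 1.4, 1.0]}
score = fitness(1.25, 1.20, simulated_rdfs, reference_rdfs)
```

A lower score means closer agreement with the reference data. The weights `w_density` and `w_rdf` are one possible way to balance the two contributions.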

Prerequisites

To run this repository, the following prerequisites are required:

System Requirements

Operating System: Linux or macOS (Windows with WSL is supported but not tested extensively).
Python Version: Python 3.7 or higher.
GROMACS: GROMACS 2022.4 was used during development. Compatibility with other versions has not been evaluated. Please ensure GROMACS is installed separately.
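Since only GROMACS 2022.4 was used during development, it can help to confirm the installed version before starting. The sketch below is not part of the repository. It assumes the executable is called `gmx` (some builds install it as `gmx_mpi`) and that the output of `gmx --version` contains a line beginning with `GROMACS version:`. The function names are hypothetical.

```python
import re
import shutil
import subprocess

def parse_gromacs_version(version_output):
    """Extract the version string from `gmx --version` output, or None if absent."""
    match = re.search(r"GROMACS version:\s*(\S+)", version_output)
    return match.group(1) if match else None

def installed_gromacs_version(executable="gmx"):
    """Return the installed GROMACS version, or None if the executable is not on PATH."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run([executable, "--version"], capture_output=True, text=True)
    # Search both streams in case the build prints its banner to stderr.
    return parse_gromacs_version(result.stdout + result.stderr)
```

Calling `installed_gromacs_version()` returns `None` when no `gmx` executable is found on the PATH, so the result can be compared against "2022.4" before launching any simulations.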

Python Dependencies

The following Python libraries are required and are listed in requirements.txt:

numpy
pandas
scikit-learn
deap
argparse

You can install these dependencies using the command:

pip install -r requirements.txt

Installation

Clone the Repository

git clone https://github.com/cocokane/LJ_paramopt_framework.git
cd LJ_paramopt_framework

Install Python Dependencies

It is recommended to use a virtual environment:

python3 -m venv venv
source venv/bin/activate

Install required Python packages:

pip install -r requirements.txt

GROMACS Installation

Ensure that GROMACS 2022.4 is installed and properly configured in your environment. Detailed installation instructions can be found on the GROMACS website.

Usage

Input Data

The reference RDF data for OO and SS pairs from AIMD simulations must be stored in the reference/ directory. The initial parameter guesses and fitness data are stored in reference/MD_data.csv.

Required File Structure

├── README.md                   # Project documentation
├── main.py                     # Main Python script
├── requirements.txt            # Python dependencies
├── reference/
│   ├── OO_target.out           # Reference OO RDF data from AIMD
│   ├── SS_target.out           # Reference SS RDF data from AIMD
│   └── MD_data.csv             # Initial parameter guesses and fitness data
├── output.txt                  # Generated LJ parameters from GA-GPR
└── LICENSE                     # Project license

Because the framework was developed primarily for sulfone molecules, main.py is configured to optimize LJ parameters for OO and SS pairs. To optimize parameters for other atom pairs, change the corresponding variables in main.py.

Ensure that the data files are correctly placed in the reference/ directory before running the main script.
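A quick check along the following lines can catch a misplaced input file before the optimization starts. This is a sketch and not part of the repository: the helper name `missing_reference_files` is hypothetical, and the file names are the ones listed in the file structure above.

```python
from pathlib import Path

# Input files named in the required file structure.
REQUIRED_FILES = ("OO_target.out", "SS_target.out", "MD_data.csv")

def missing_reference_files(reference_dir="reference"):
    """Return the names of required input files not found in reference_dir."""
    root = Path(reference_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Running `missing_reference_files()` from the repository root returns an empty list when all three inputs are in place, and otherwise lists the files that still need to be added.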

To run the main script, use the following command:

python main.py

The script will execute the GA-GPR optimization process and output the predicted optimal LJ parameters to output.txt.
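To illustrate the GA half of this process, the sketch below minimizes an arbitrary surrogate function over box-bounded parameters. It is not the code in main.py: it swaps in a small plain-Python elitist GA so that it runs without deap or scikit-learn, and `toy_surrogate` is a hypothetical stand-in for the prediction of a trained GPR model. The function name, the operator choices (uniform crossover, Gaussian mutation), the bounds, and all default settings are assumptions.

```python
import random

def minimize_with_ga(surrogate, bounds, pop_size=40, generations=50,
                     mutation_rate=0.2, seed=0):
    """Minimize surrogate(params) inside box bounds with a simple elitist GA."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def mutate(individual):
        # Gaussian perturbation scaled to 5% of each parameter range, clipped to bounds.
        mutated = []
        for x, (lo, hi) in zip(individual, bounds):
            if rng.random() < mutation_rate:
                x += rng.gauss(0.0, 0.05 * (hi - lo))
            mutated.append(min(max(x, lo), hi))
        return mutated

    def crossover(a, b):
        # Uniform crossover: each gene is taken from one of the two parents.
        return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=surrogate)
        parents = population[: pop_size // 2]  # the better half survives unchanged
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return min(population, key=surrogate)

# Hypothetical surrogate with a known minimum, standing in for a GPR prediction.
def toy_surrogate(params):
    return (params[0] - 0.3) ** 2 + (params[1] - 3.5) ** 2

toy_bounds = [(0.0, 1.0), (3.0, 4.0)]
best = minimize_with_ga(toy_surrogate, toy_bounds)
```

In the actual workflow the surrogate would be the fitness predicted by the GPR model trained on MD_data.csv, so each GA evaluation costs a model prediction rather than an MD simulation.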

The next step involves performing MD simulations in GROMACS using the parameters stored in output.txt. Sample files, including the training dataset (MD_data.csv), AIMD reference RDFs (OO_target.out and SS_target.out), and the output file (output.txt), are provided as an example to familiarize users with the expected data structure.

Steps to run the framework

1. Starting from the OPLS non-bonded parameters, generate 200 new parameter sets within ±5% of the OPLS values.
2. Run classical MD for each of the 200 parameter sets, extract the density and the required RDFs, and compile the results into MD_data.csv.
3. Run main.py to generate output.txt, which contains the optimized LJ parameters. The script trains the GPR model on MD_data.csv and predicts new parameters, completing one iteration of the optimization process.
4. Perform classical MD in GROMACS using the best parameters from this iteration, which are stored in output.txt.
5. Depending on your accuracy requirements, either stop, or update MD_data.csv with the new results and repeat steps 3 and 4 until the desired accuracy is reached.
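The first step above, generating the initial parameter sets, could look roughly like the following. This is a sketch, not the authors' script: uniform sampling within the ±5% window is an assumption, the function name is hypothetical, and the dictionary values are placeholders rather than real OPLS parameters.

```python
import random

def sample_parameter_sets(opls_params, n_sets=200, deviation=0.05, seed=0):
    """Draw n_sets parameter dictionaries uniformly within ±deviation of each OPLS value."""
    rng = random.Random(seed)
    return [
        {name: value * (1.0 + rng.uniform(-deviation, deviation))
         for name, value in opls_params.items()}
        for _ in range(n_sets)
    ]

# Placeholder values for illustration only; substitute the actual OPLS parameters.
example_opls = {"sigma_O": 0.30, "epsilon_O": 0.70, "sigma_S": 0.35, "epsilon_S": 1.00}
parameter_sets = sample_parameter_sets(example_opls)
```

Each dictionary in `parameter_sets` corresponds to one classical MD run in the second step. Fixing `seed` keeps the initial sample reproducible.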
