Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A problem with cyclopropane proper torsion term #60

Open
drmeister opened this issue Jan 29, 2023 · 3 comments
Open

A problem with cyclopropane proper torsion term #60

drmeister opened this issue Jan 29, 2023 · 3 comments

Comments

@drmeister
Copy link

Is this proper torsion term malformed? Is it supposed to recognize a cyclopropane with an exocyclic atom?

https://github.com/openforcefield/openff-forcefields/blob/main/openforcefields/offxml/openff-2.0.0.offxml#L158

<Proper smirks="[#6X4;r3:1]-[#6X4;r3:2]-[#6X4;r3:3]-[*:4]" periodicity1="2" periodicity2="1" phase1="0.0 * degree" phase2="0.0 * degree" id="t16" k1="3.656359841852 * mole**-1 * kilocalorie" k2="3.357036752304 * mole**-1 * kilocalorie" idivf1="1.0" idivf2="1.0"></Proper>

I use the VF2 algorithm to recognize these SMIRKS strings, and VF2 fails on the pattern [#6X4;r3:1]-[#6X4;r3:2]-[#6X4;r3:3]-[*:4] within a cyclopropane because it only recognizes this in the context where the first atom and the third atom are NOT bonded to each other.

If this pattern is supposed to recognize a cyclopropane, a SMIRKS string for this that should not cause additional problems and would work with VF2 would be...
[#6X4;r3:1]1-[#6X4;r3:2]-[#6X4;r3:3]1-[*:4]

This adds ring closure numbers to the first and third atoms.

You could test if changing the pattern changes your parameter assignment by generating parameters with the parameter as it is and comparing it to a force field where you use the pattern above.

You use a pattern with ring-closure numbers in this angle term...

https://github.com/openforcefield/openff-forcefields/blob/main/openforcefields/offxml/openff-2.0.0.offxml#L103

@j-wags
Copy link
Member

j-wags commented Feb 1, 2023

Thanks for the detailed report, @drmeister.

Interesting - the two SMIRKS should have equivalent meanings in most cases. Though the new SMIRKS would lead to different parameter assignment in a small number of molecules. I'm not sure whether we had spiropentane in mind when defining this SMIRKS, but in case we do this would lead to some different parameters being assigned.

from openff.toolkit import Molecule, ForceField
mol = Molecule.from_smiles('C12(CC1)CC2')
mol.visualize(backend='rdkit')

image

# Count how many times the term gets assigned using the original SMIRKS
ff = ForceField('openff-2.0.0.offxml')
labels = ff.label_molecules(mol.to_topology())
print(len([(key, val) for key, val in labels[0]["ProperTorsions"].items() if val.id=="t16"]))

40

# Switch to @drmeister's proposed new SMIRKS (verifying the change to the FF along the way)
from openff.toolkit.utils.exceptions import ParameterLookupError
ff2 = ForceField('openff-2.0.0.offxml')
param = ff2["ProperTorsions"]["[#6X4;r3:1]-[#6X4;r3:2]-[#6X4;r3:3]-[*:4]"]
param.smirks = "[#6X4;r3:1]1-[#6X4;r3:2]-[#6X4;r3:3]1-[*:4]"
try:
    param = ff2["ProperTorsions"]["[#6X4;r3:1]-[#6X4;r3:2]-[#6X4;r3:3]-[*:4]"]
except ParameterLookupError:
    param = ff2["ProperTorsions"]["[#6X4;r3:1]1-[#6X4;r3:2]-[#6X4;r3:3]1-[*:4]"]
    print("Parameter smirks replacement verified")
# Count the number of times the modified term gets assigned
labels = ff2.label_molecules(mol.to_topology())
print(len([(key, val) for key, val in labels[0]["ProperTorsions"].items() if val.id=="t16"]))

Parameter smirks replacement verified
24

It'd take a bit more person-time to determine whether the change in parameter assignment leads to significant changes, and skimming the SMARTS and SMIRKS specifications I don't have cause to think that the current parameter SMIRKS is misdefined. So I see the benefit of re-defining this SMIRKS for users of the VF2 algorithm but it will probably take us a while to clear our "on fire" stack of force field to-dos and validate that this change doesn't have other unintended effects.

@davidlmobley @lilyminium - Do you have any thoughts?

@drmeister
Copy link
Author

Hey - thanks for getting back to me.

So you do not use an implementation of the VF2 algorithm to assign parameters?
Huh - I thought that the VF2 algorithm for graph isomorphism testing was the only way to do this.
Question 1: What do you use?
Question 2: What if you added another dihedral parameter with the "[#6X4;r3:1]1-[#6X4;r3:2]-[#6X4;r3:3]1-[*:4]" pattern? That's what I did to get past this problem. I created a second offxml file with a single dihedral parameter with that SMIRKS string and the parameters from the original dihedral parameter. That allowed me to proceed.

@lilyminium
Copy link
Collaborator

Hi @drmeister, glad to see you've been able to progress by adding a new parameter! We don't use the VF2 algorithm -- instead, we use the graph-matching backends of RDKit and OpenEye that return all possible results for a search query (rdkit.GetSubstructMatches and oechem.OESubSearch). We update the results iteratively with each SMARTS pattern, so later/lower ones have "priority" over earlier, general parameters. Because [#6X4;r3:1]-[#6X4;r3:2]-[#6X4;r3:3] matches all triplets in a three-membered ring whether 1 and 3 are bonded or not, assigning the parameters the "normal" way does actually get intended torsion:

from openff.toolkit import Molecule, ForceField
ff = ForceField("openff-2.0.0.offxml")
mol = Molecule.from_smiles("C1CC1")
parameters = ff.label_molecules(mol.to_topology())
print(parameters[0]["ProperTorsions"][(0, 1, 2, 7)])
<ProperTorsionType with smirks: [#6X4;r3:1]-[#6X4;r3:2]-[#6X4;r3:3]-[*:4]  periodicity1: 2  periodicity2: 1  phase1: 0.0 degree  phase2: 0.0 degree  id: t16  k1: 3.656359841852 kilocalorie / mole  k2: 3.357036752304 kilocalorie / mole  idivf1: 1.0  idivf2: 1.0  >

So on a technical point, I don't think that's a huge issue for the force field. On the scientific side, perhaps the torsions shouldn't be the same within a ring vs. across two rings, and adding a new parameter to the force field would be an improvement. As Jeff says, though, we'd want to take some time to validate things. If it turns out one parameter addresses them both well, I'd worry that near-redundant parameters could introduce noise during future force field optimisation, for not much gain. (Another very slight downside is that because of our searching strategy, it could slightly increase parameter assignment time).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants