-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make charge symmetrization stereo-aware #738
Conversation
symmetrize=True flag was already being passed to toolkit functions but there were still small asymmetries in some cases
timemachine/ff/handlers/utils.py
Outdated
from openeye import oechem | ||
|
||
oemol = convert_to_oe(rdmol) | ||
oechem.OEPerceiveSymmetry(oemol) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
is this stereo-aware? (i.e. a chiral graph may be asymmetric, but may be falsely perceived to be symmetric unless the generators of the automorphism group is aware of this)
-
bool OEPerceiveSymmetry(OEMolBase &, bool includeH=true, bool automorph=false), what does automorph do here? API docs unfortunately are quite sparse. @bp-kelley @badisa might know...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, not sure what the correct behavior is for that molecule. Enumerating all configurations around the double bonds
The symmetry classes assigned by this method always appear the same
The charges currently assigned (without applying this extra symmetrize step) vary by stereoisomer, but are already exactly or nearly constant within each symmetry class. Charges assigned to each stereoisomer
Deviation within each symmetry class for each stereoisomer
The effect of the change proposed in this PR on this molecule would be to eliminate the ~ |
The two fluorines should not be in the same symmetry class (for the stereoisomer depicted in the diagram). Considering all conformations that satisfy the depicted stereo, running AM1 calculations should've resulted in different charges and should not be symmetrized. Although the diff is very small [-2.082 vs -2.081] shown below, they should be different! [-2.082, 0.794, -2.22, -1.398, -1.398, -2.22, 0.794, -2.081, 1.686, 1.69, 1.53, 1.53, 1.69, 1.686] I suspect it may be because AM1 is not sensitive enough on the raw charges for the example I provided, maybe try this one, where the environmental differences around the Fs are more drastic: O/C(=C(/O)\C(\Cl)=C(\F)Cl)/C(/Cl)=C(\F)Cl |
Ahh, thanks for the more dramatic example! Enumerating all configurations around the double-bonds in the molecule
If we want the charges to be stereo asymmetric, I wonder if we should modify these lines timemachine/timemachine/ff/handlers/nonbonded.py Lines 89 to 91 in 451803e
|
Toggling timemachine/timemachine/ff/handlers/nonbonded.py Lines 89 to 91 in 451803e
O/C(=C(/O)\C(\Cl)=C(\F)Cl)/C(/Cl)=C(\F)Cl (as should be expected).
|
Co-Authored-By: Yutong Zhao <[email protected]>
Based on offline discussion with @proteneer -- looked into using the RDKit alternative https://www.rdkit.org/docs/source/rdkit.Chem.rdmolfiles.html#rdkit.Chem.rdmolfiles.CanonicalRankAtoms . This appears to have the desired behavior! The PR now:
The number of distinct partial charges is reduced on benzene (from 3 on master to 2 on a220996), and increased on |
From #738 (comment) Co-Authored-By: jkausrelay <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! would be good sanity check that this doesn't drastically affect FreeSolv results but LGTM!
Checking effect on computed charges for FreeSolv molecules (ref: master (451803e) -> proposed: most recent PR commit (9ce6cb9)) Histogram of signed charge changes per atom (note log y scale) For ~99.2% of atoms, the effect on charge is < 0.001 e, and ~99.7% of atoms, the effect on charge is < 0.01 e. Histogram of unsigned charge changes per molecule (max over atoms within each molecule, note log x scale) Displaying the molecules where the difference exceeds arbitrary threshold of 0.01 e Interesting that these all have one or more nitrate groups... Haven't yet checked the effect of the proposed change on computed hydration free energies, and I'd be curious also to compare the assigned symmetry classes, esp. for the plotted molecules. |
Can this be merged? |
Can you try this (thanks @bp-kelley for the suggestion on ResonanceMolSupplier!): from rdkit import Chem
import numpy as np
def resonance_aware_ranking(mol):
flags = Chem.ResonanceFlags() | Chem.ResonanceFlags.ALLOW_CHARGE_SEPARATION
permutations = []
for m in Chem.ResonanceMolSupplier(mol, flags):
res = Chem.CanonicalRankAtoms(m, breakTies=False, includeChirality=True)
permutations.append(list(res))
permutations = np.array(permutations).T
# identify which group each orbit belongs to
iota = 0
kv = dict()
classes = []
for a_idx, orbit in enumerate(permutations):
canon_orbit = tuple(sorted(orbit))
if canon_orbit not in kv:
kv[canon_orbit] = iota
iota += 1
classes.append(kv[canon_orbit])
return classes
print(resonance_aware_rank(Chem.AddHs(Chem.MolFromSmiles("[O-]C(=O)C"))))
print(resonance_aware_rank(Chem.AddHs(Chem.MolFromSmiles("CO[N+]([O-])=O")))) |
Thanks @proteneer and @bp-kelley -- will try this out! (Just want to strengthen the test to assert that the difference between old and new symmetrization behavior is indeed limited to stereo-awareness.) |
Stale -- migrated to #817 |
symmetrize=True
to the AM1 routine (since this appears to average over stereo-oblivious atom classes, and occasionally introduces very small asymmetries in cases like benzene)n_atoms
distinct charges are expected, but ~n_atoms/2
were produced)Initial PR description:
symmetrize=True
flag was already being passed to toolkit functions but there were still small asymmetries in some cases (see e.g. #737 (comment) )Issue magnitude: very very small. (In 82 of 642 FreeSolv molecules, the partial charges within a symmetry class varied up to ~0.0002 e. However, it's still nice for
len(set(assigned_charges))
for benzene to be 2 rather than 3.)