Commit 99bf00c

Use rdkit for SSSR and RCs (bug fix + Python upgrade)
Currently we use `RingDecomposerLib` for finding the Smallest Set of Smallest Rings and getting the Relevant Cycles. This package does not support Python 3.10+ and is thus blocking further upgrades to RMG. @KnathanM in particular is looking to get RMG to Python 3.11 so as to add support for ChemProp v2.

I believe we can just use RDKit to do these operations instead. The original paper mentions that the functionality was being moved upstream to RDKit.

With the help of AI I've taken just a first pass at reimplementing, with the special note that:

- I opted to use the Symmetric SSSR in place of the 'true' SSSR. This is because the latter is non-unique (see [RDKit's "The SSSR Problem"](https://www.rdkit.org/docs/GettingStartedInPython.html#the-sssr-problem)). This should actually resolve #2562
- I need to read more about the "Relevant Cycles"

This PR will be a draft for now, as it is predicated on Python 3.9 already being available (which it nearly is in #2741)
1 parent 9639a70 commit 99bf00c
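
The non-uniqueness that motivates the symmetric SSSR can be seen on cubane without any chemistry toolkit. The sketch below is an illustration only and is not code from this commit; the cube-graph construction and the helper names `cycle_rank` and `four_membered_rings` are assumptions made for the example. It compares the number of rings any SSSR must contain (the cycle rank, edges minus vertices plus connected components) with the number of equivalent four-membered faces.

```python
from itertools import combinations

# Cubane's carbon skeleton as a cube graph: vertices are the 3-bit integers 0..7,
# and two vertices are bonded when they differ in exactly one bit.
VERTICES = range(8)
EDGES = {frozenset((v, v ^ (1 << bit))) for v in VERTICES for bit in range(3)}


def cycle_rank(n_vertices, n_edges, n_components=1):
    """Number of rings in any SSSR (the cyclomatic number of the graph)."""
    return n_edges - n_vertices + n_components


def four_membered_rings(edges):
    """Return the vertex sets of all 4-membered rings.

    Rings are identified by their vertex set, which is sufficient for the
    cube graph but would merge distinct rings in denser graphs.
    """
    adjacency = {}
    for edge in edges:
        a, b = tuple(edge)
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    rings = set()
    # A 4-membered ring a-b-c-d has opposite corners a and c that share
    # the two neighbours b and d.
    for a, c in combinations(sorted(adjacency), 2):
        shared = adjacency[a] & adjacency[c]
        for b, d in combinations(sorted(shared), 2):
            rings.add(frozenset((a, b, c, d)))
    return rings


sssr_size = cycle_rank(len(VERTICES), len(EDGES))
faces = four_membered_rings(EDGES)
print(f"rings in an SSSR: {sssr_size}, equivalent 4-membered faces: {len(faces)}")
```

Any SSSR of cubane therefore has to leave out one of six symmetry-equivalent faces, and which one is dropped is arbitrary. A symmetrized set that keeps all equivalent rings avoids that choice, at the cost of no longer being a minimal basis.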

4 files changed: 49 additions, 78 deletions
.github/workflows/CI.yml

Lines changed: 1 addition & 1 deletion

@@ -56,7 +56,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.9"]
+        python-version: ["3.9", "3.10", "3.11", "3.12"]
         os: [macos-13, macos-latest, ubuntu-latest]
         include-rms: ["", "with RMS"]
         exclude:

environment.yml

Lines changed: 0 additions & 1 deletion

@@ -83,7 +83,6 @@ dependencies:
   # bug in quantities, see:
   # https://github.com/ReactionMechanismGenerator/RMG-Py/pull/2694#issuecomment-2489286263
   - conda-forge::quantities !=0.16.0,!=0.16.1
-  - conda-forge::ringdecomposerlib-python

   # packages we maintain
   - rmg::pydas >=1.0.3

rmgpy/molecule/graph.pyx

Lines changed: 48 additions & 59 deletions

@@ -35,7 +35,7 @@ are the components of a graph.

 import itertools

-import py_rdl
+from rdkit import Chem

 from rmgpy.molecule.vf2 cimport VF2

@@ -971,68 +971,57 @@ cdef class Graph(object):

     cpdef list get_smallest_set_of_smallest_rings(self):
         """
-        Returns the smallest set of smallest rings as a list of lists.
-        Uses RingDecomposerLib for ring perception.
-
-        Kolodzik, A.; Urbaczek, S.; Rarey, M.
-        Unique Ring Families: A Chemically Meaningful Description
-        of Molecular Ring Topologies.
-        J. Chem. Inf. Model., 2012, 52 (8), pp 2013-2021
-
-        Flachsenberg, F.; Andresen, N.; Rarey, M.
-        RingDecomposerLib: An Open-Source Implementation of
-        Unique Ring Families and Other Cycle Bases.
-        J. Chem. Inf. Model., 2017, 57 (2), pp 122-126
-        """
-        cdef list sssr
-        cdef object graph, data, cycle
-
-        graph = py_rdl.Graph.from_edges(
-            self.get_all_edges(),
-            _get_edge_vertex1,
-            _get_edge_vertex2,
-        )
-
-        data = py_rdl.wrapper.DataInternal(graph.get_nof_nodes(), graph.get_edges().keys())
-        data.calculate()
-
-        sssr = []
-        for cycle in data.get_sssr():
-            sssr.append(self.sort_cyclic_vertices([graph.get_node_for_index(i) for i in cycle.nodes]))
-
+        Returns the smallest set of smallest rings (SSSR) as a list of lists of atom indices.
+        Uses RDKit's built-in ring perception (GetSymmSSSR).
+
+        References:
+        Kolodzik, A.; Urbaczek, S.; Rarey, M.
+        Unique Ring Families: A Chemically Meaningful Description
+        of Molecular Ring Topologies.
+        J. Chem. Inf. Model., 2012, 52 (8), pp 2013-2021
+
+        Flachsenberg, F.; Andresen, N.; Rarey, M.
+        RingDecomposerLib: An Open-Source Implementation of
+        Unique Ring Families and Other Cycle Bases.
+        J. Chem. Inf. Model., 2017, 57 (2), pp 122-126
+        """
+        cdef list sssr = []
+        cdef object ring_info, ring
+        # Get the symmetric SSSR using RDKit
+        ring_info = Chem.GetSymmSSSR(self)
+        for ring in ring_info:
+            # Convert ring (tuple of atom indices) to sorted list
+            sorted_ring = self.sort_cyclic_vertices(list(ring))
+            sssr.append(sorted_ring)
         return sssr

     cpdef list get_relevant_cycles(self):
         """
-        Returns the set of relevant cycles as a list of lists.
-        Uses RingDecomposerLib for ring perception.
-
-        Kolodzik, A.; Urbaczek, S.; Rarey, M.
-        Unique Ring Families: A Chemically Meaningful Description
-        of Molecular Ring Topologies.
-        J. Chem. Inf. Model., 2012, 52 (8), pp 2013-2021
-
-        Flachsenberg, F.; Andresen, N.; Rarey, M.
-        RingDecomposerLib: An Open-Source Implementation of
-        Unique Ring Families and Other Cycle Bases.
-        J. Chem. Inf. Model., 2017, 57 (2), pp 122-126
-        """
-        cdef list rc
-        cdef object graph, data, cycle
-
-        graph = py_rdl.Graph.from_edges(
-            self.get_all_edges(),
-            _get_edge_vertex1,
-            _get_edge_vertex2,
-        )
-
-        data = py_rdl.wrapper.DataInternal(graph.get_nof_nodes(), graph.get_edges().keys())
-        data.calculate()
-
-        rc = []
-        for cycle in data.get_rcs():
-            rc.append(self.sort_cyclic_vertices([graph.get_node_for_index(i) for i in cycle.nodes]))
-
+        Returns the set of relevant cycles as a list of lists of atom indices.
+        Uses RDKit's RingInfo to approximate relevant cycles.
+
+        References:
+        Kolodzik, A.; Urbaczek, S.; Rarey, M.
+        Unique Ring Families: A Chemically Meaningful Description
+        of Molecular Ring Topologies.
+        J. Chem. Inf. Model., 2012, 52 (8), pp 2013-2021
+
+        Flachsenberg, F.; Andresen, N.; Rarey, M.
+        RingDecomposerLib: An Open-Source Implementation of
+        Unique Ring Families and Other Cycle Bases.
+        J. Chem. Inf. Model., 2017, 57 (2), pp 122-126
+        """
+        cdef list rc = []
+        cdef object mol = self
+        cdef object ring_info = mol.GetRingInfo()
+        cdef object atom_rings = ring_info.AtomRings()
+        cdef object ring
+        for ring in atom_rings:
+            # Convert ring (tuple of atom indices) to sorted list
+            sorted_ring = self.sort_cyclic_vertices(list(ring))
+            # Filter for "relevant" cycles (e.g., rings up to size 7)
+            if len(sorted_ring) <= 7:
+                rc.append(sorted_ring)
         return rc

     cpdef list sort_cyclic_vertices(self, list vertices):
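
The new `get_relevant_cycles` describes its result as an approximation and keeps only rings with at most seven atoms. Relevant cycles are usually defined as the rings that occur in at least one minimum cycle basis, which places no limit on ring size. The pure-Python sketch below is not part of the commit; `ring_edges`, `cycle_rank`, and `apply_size_cutoff` are hypothetical helpers written to show how such a cutoff behaves on a single isolated ring.

```python
def ring_edges(size):
    """Edges of a simple ring whose vertices are labelled 0..size-1."""
    return [(i, (i + 1) % size) for i in range(size)]


def cycle_rank(n_vertices, n_edges, n_components=1):
    """Number of rings in any cycle basis of the graph."""
    return n_edges - n_vertices + n_components


def apply_size_cutoff(rings, max_size=7):
    """Mirror the `len(sorted_ring) <= 7` filter shown in the diff."""
    return [ring for ring in rings if len(ring) <= max_size]


# Cycloheptane-like and cyclooctane-like skeletons: one ring each.
seven_ring = list(range(7))
eight_ring = list(range(8))

rank_of_eight_ring = cycle_rank(8, len(ring_edges(8)))
kept_from_seven = apply_size_cutoff([seven_ring])
kept_from_eight = apply_size_cutoff([eight_ring])

print("cycle rank of the 8-membered ring:", rank_of_eight_ring)
print("rings kept for the 7-membered ring:", len(kept_from_seven))
print("rings kept for the 8-membered ring:", len(kept_from_eight))
```

The eight-membered ring has cycle rank 1, so it is the only member of every cycle basis, yet the cutoff returns no rings for it. Ring-size-dependent behaviour of this kind is one thing to check when comparing the new output against the previous `RingDecomposerLib` results.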

utilities.py

Lines changed: 0 additions & 17 deletions

@@ -50,7 +50,6 @@ def check_dependencies():
     missing = {
         'openbabel': _check_openbabel(),
         'pydqed': _check_pydqed(),
-        'pyrdl': _check_pyrdl(),
         'rdkit': _check_rdkit(),
         'symmetry': _check_symmetry(),
     }
@@ -104,22 +103,6 @@ def _check_pydqed():
     return missing


-def _check_pyrdl():
-    """Check for pyrdl"""
-    missing = False
-
-    try:
-        import py_rdl
-    except ImportError:
-        print('{0:<30}{1}'.format('pyrdl', 'Not found. Necessary for ring perception algorithms.'))
-        missing = True
-    else:
-        location = py_rdl.__file__
-        print('{0:<30}{1}'.format('pyrdl', location))
-
-    return missing
-
-
 def _check_rdkit():
     """Check for RDKit"""
     missing = False
