layout | title | parent | nav_order | toc | summary | permalink |
---|---|---|---|---|---|---|
default | Protonation Site Calculation | Examples and Guides | 5 | false | A guide to protonation/deprotonation sampling. | /page/examples/example_5.html |
{: .no_toc }
{{ page.summary }} {: .fs-6 .fw-300 }
{: .no_toc .text-delta }
- TOC {:toc}
{% include important.html content="Due to technical reasons the atom order of the input coordinates for all protonation/deprotonation applications of CREST is always presorted. In the sorted structures all hydrogen atoms are written to the end of the file." %}
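The effect of this presorting can be illustrated with a short Python sketch. It assumes that the sorting amounts to a stable partition that keeps the heavy atoms in their input order and moves all hydrogen atoms to the end, which is consistent with the sorting table printed in the CREST output of this example. The function name `presort_hydrogens_last` is hypothetical and not part of CREST.

```python
def presort_hydrogens_last(elements):
    """Return the old (1-based) atom indices in their new order.

    Assumption: heavy atoms keep their relative order and all hydrogen
    atoms are appended afterwards, also in their original relative order.
    """
    heavy = [i for i, el in enumerate(elements, start=1) if el.upper() != "H"]
    hydrogens = [i for i, el in enumerate(elements, start=1) if el.upper() == "H"]
    return heavy + hydrogens


# Element order of the Ala-Gly input used in this example (struc.xyz).
ala_gly = ["C", "C", "N", "C", "C", "O", "N", "O", "O", "H",
           "H", "C", "H", "H", "H", "H", "H", "H", "H", "H"]

new_order = presort_hydrogens_last(ala_gly)
for new_index, old_index in enumerate(new_order, start=1):
    # Columns: element, old index, new index
    print(ala_gly[old_index - 1], old_index, new_index)
```

For this input, only the carbon atom at position 12 and the two hydrogen atoms at positions 10 and 11 change places.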
The protonation site screening is one of CREST's original workflows. In the following, it is demonstrated for the alanineglycine molecule from Example 1. {: .text-justify }
{% include image.html file="example-5-1.png" alt="Ala-Gly protonation" caption="Protonation of the alanineglycine molecule. The most stable protonated structure (GFN2-xTB, gas phase) is shown on the right." max-width=450 %}
Assuming the input coordinates are given as `struc.xyz`, the screening procedure with the default settings (GFN2-xTB in the gas phase) can be started with `crest struc.xyz --protonate`:
{: .text-justify }
command
{{ site.data.icons.codefile }} struc.xyz
{{ site.data.icons.checkfile }} output
{{ site.data.icons.checkfile }} protonated.xyz
C 2.081440 0.615100 -0.508430
C 2.742230 1.824030 -1.200820
N 4.117790 1.799870 -1.190410
C 4.943570 2.827040 -1.822060
C 6.440080 2.569360 -1.637600
O 7.351600 3.252270 -2.069090
N 0.610100 0.695090 -0.538780
O 2.095560 2.724940 -1.739670
O 6.705220 1.463410 -0.897460
H 0.303080 1.426060 0.103770
H 0.338420 1.050680 -1.460480
C 2.488753 -0.593400 -1.198448
H 2.416500 0.557400 0.532050
H 4.614100 1.081980 -0.670550
H 4.699850 3.794460 -1.373720
H 4.722890 2.844690 -2.894180
H 7.687400 1.448620 -0.860340
H 2.029201 -1.457008 -0.719999
H 2.170233 -0.542411 -2.238576
H 3.572730 -0.688405 -1.154998
{% endcapture %}
{% include codecell.html content=struc_xyz %}
Cite work conducted with this code as
• P.Pracht, F.Bohle, S.Grimme, PCCP, 2020, 22, 7169-7192.
• S.Grimme, JCTC, 2019, 15, 2847-2862.
and for works involving QCG as
• S.Spicher, C.Plett, P.Pracht, A.Hansen, S.Grimme, JCTC, 2022, 18 (5), 3174-3189.
with help from: C.Bannwarth, F.Bohle, S.Ehlert, S.Grimme, C.Plett, P.Pracht, S.Spicher
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Command line input:
crest struc.xyz --protonate
--protonate : automated protonation script
 __________________________________________
|                                          |
|       automated protonation script       |
|__________________________________________|
 Universitaet Bonn, MCTC
 P.Pracht, Wed 28. Nov 13:11:52 CEST 2018
Cite as: P.Pracht, C.A.Bauer, S.Grimme JCC, 2017, 38, 2618–2631.
Input coordinate lines sorted:
 element   old   new
 C           1     1
 C           2     2
 N           3     3
 C           4     4
 C           5     5
 O           6     6
 N           7     7
 O           8     8
 O           9     9
 C          12    10
 H          10    11
 H          11    12
 H          13    13
 H          14    14
 H          15    15
 H          16    16
 H          17    17
 H          18    18
 H          19    19
 H          20    20
LMO calculation ... done.
- crude pre-optimization
Optimizing all 13 structures from file 'protonate_0.xyz' ...
 1 2 3 4 5 6 7 8 9 10 11 12 13
done.
 12 structures remain within 90.00 kcal/mol window
- loose optimization
Optimizing all 12 structures from file 'protonate_1.xyz' ...
 1 2 3 4 5 6 7 8 9 10 11 12
done.
 12 structures remain within 60.00 kcal/mol window
- optimization with user-defined thresholds
Optimizing all 12 structures from file 'protonate_2.xyz' ...
 1 2 3 4 5 6 7 8 9 10 11 12
done.
 9 structures remain within 30.00 kcal/mol window
===================================================
Identifying topologically equivalent structures:
 Equivalent to 2. structure: 7 structure(s).
Done.
Appending file 'protonated.xyz' with structures.
Initial 9 structures from file protonate_3.xyz have been reduced to 3 topologically unique structures.
===================================================
============= ordered structure list ==============
written to file 'protonated.xyz'
 structure   ΔE(kcal/mol)   Etot(Eh)
    1            0.00      -33.953296
    2            2.33      -33.949576
    3           28.73      -33.907516
LMO calc. wall time : 0h : 0m : 0s
multilevel OPT wall time : 0h : 0m : 1s
Overall wall time : 0h : 0m : 1s
CREST terminated normally.
{% endcapture %} {% include codecell.html content=output_file %}
The production run in this example yields 3 possible structures of Ala-Gly+H+ at the GFN2-xTB level in the gas phase.
The two most stable structures lie within 2.33 kcal/mol of each other. The third structure is much higher in energy, at 28.73 kcal/mol.
These structures are written to the `protonated.xyz` ensemble file.
Note that the order of the hydrogen atoms changes between the structures, so the file does not formally match the ensemble file format. Because only atoms of the same element are interchanged, CREST can handle the file nonetheless.
{: .text-justify }
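The relative energies in the ordered structure list follow directly from the total energies. The Python sketch below recomputes them from the three `Etot` values of the protonation run, assuming the usual conversion factor of about 627.5095 kcal/mol per Hartree; the variable names are illustrative only.

```python
HARTREE_TO_KCAL = 627.5095  # kcal/mol per Eh (rounded conversion factor)

# Total energies (Eh) copied from the ordered structure list of the run.
etot = [-33.953296, -33.949576, -33.907516]

e_min = min(etot)
delta_e = [(e - e_min) * HARTREE_TO_KCAL for e in etot]

for rank, (e, de) in enumerate(zip(etot, delta_e), start=1):
    # Columns: structure number, relative energy (kcal/mol), total energy (Eh)
    print(f"{rank:3d} {de:8.2f} {e:12.6f}")
```

Rounded to two decimals, the recomputed values agree with the ΔE column of the CREST output.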
It is possible to add other ions besides H+ with the `--swel <symbol>` command (short for "switch element").
To do so, specify the element symbol and the charge as the `<symbol>` argument, for example
{: .text-justify }
crest struc.xyz --protonate --swel ca2+
to add Ca2+ instead of H+. In principle, any element/charge combination can be added with this command. However, adding polyatomic ions is currently not possible.
{: .text-justify }
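Before launching a run, it can help to check that the ion string has the expected shape. The following Python sketch assumes that the `<symbol>` argument consists of an element symbol, an optional charge magnitude, and a sign, as in `ca2+`. The helper `parse_ion` and its regular expression are hypothetical and do not reproduce CREST's own parser, which may accept other spellings.

```python
import re

# Assumed pattern: one or two letters, optional digits, then '+' or '-'.
_ION_PATTERN = re.compile(r"^([A-Za-z]{1,2})(\d*)([+-])$")


def parse_ion(spec):
    """Split an ion string such as 'ca2+' into (element, charge)."""
    match = _ION_PATTERN.match(spec.strip())
    if match is None:
        raise ValueError(f"unexpected ion specification: {spec!r}")
    symbol, magnitude, sign = match.groups()
    charge = int(magnitude) if magnitude else 1  # no digits means a single charge
    if sign == "-":
        charge = -charge
    return symbol.capitalize(), charge


print(parse_ion("ca2+"))  # ('Ca', 2)
```

The check only covers the shape of the string; it does not verify that the symbol is a real element or that the method supports it.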
The deprotonation site screening is a very simple process. One by one, protons are removed from the given input structure and the resulting geometries are optimized. In the following, it is demonstrated for the alanineglycine molecule from above {{site.data.icons.aup}}.
{: .text-justify }
{% include image.html file="example-5-2.png" alt="Ala-Gly deprotonation" caption="Deprotonation of the alanineglycine molecule. The most stable deprotonated structure (GFN2-xTB, gas phase) is shown on the right." max-width=450 %}
Assuming the input coordinates are given as `struc.xyz`, the screening procedure with the default settings (GFN2-xTB in the gas phase) can be started with `crest struc.xyz --deprotonate`:
{: .text-justify }
command
{{ site.data.icons.codefile }} struc.xyz
{{ site.data.icons.checkfile }} output
{{ site.data.icons.checkfile }} deprotonated.xyz
C 2.081440 0.615100 -0.508430
C 2.742230 1.824030 -1.200820
N 4.117790 1.799870 -1.190410
C 4.943570 2.827040 -1.822060
C 6.440080 2.569360 -1.637600
O 7.351600 3.252270 -2.069090
N 0.610100 0.695090 -0.538780
O 2.095560 2.724940 -1.739670
O 6.705220 1.463410 -0.897460
H 0.303080 1.426060 0.103770
H 0.338420 1.050680 -1.460480
C 2.488753 -0.593400 -1.198448
H 2.416500 0.557400 0.532050
H 4.614100 1.081980 -0.670550
H 4.699850 3.794460 -1.373720
H 4.722890 2.844690 -2.894180
H 7.687400 1.448620 -0.860340
H 2.029201 -1.457008 -0.719999
H 2.170233 -0.542411 -2.238576
H 3.572730 -0.688405 -1.154998
{% endcapture %}
{% include codecell.html content=struc_xyz %}
==============================================
| |
| C R E S T |
| |
| Conformer-Rotamer Ensemble Sampling Tool |
| based on the GFN methods |
| P.Pracht, S.Grimme |
| Universitaet Bonn, MCTC |
==============================================
Version 2.12, Thu 19. Mai 16:32:32 CEST 2022
Using the xTB program. Compatible with xTB version 6.4.0
Cite work conducted with this code as
• P.Pracht, F.Bohle, S.Grimme, PCCP, 2020, 22, 7169-7192.
• S.Grimme, JCTC, 2019, 15, 2847-2862.
and for works involving QCG as
• S.Spicher, C.Plett, P.Pracht, A.Hansen, S.Grimme, JCTC, 2022, 18 (5), 3174-3189.
with help from: C.Bannwarth, F.Bohle, S.Ehlert, S.Grimme, C.Plett, P.Pracht, S.Spicher
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Command line input:
crest struc.xyz --deprotonate
--deprotonate : automated deprotonation script
 __________________________________________
|                                          |
|      automated deprotonation script      |
|__________________________________________|
 Universitaet Bonn, MCTC
 P.Pracht, Wed 28. Nov 13:11:52 CEST 2018
Input coordinate lines sorted:
 element   old   new
 C           1     1
 C           2     2
 N           3     3
 C           4     4
 C           5     5
 O           6     6
 N           7     7
 O           8     8
 O           9     9
 C          12    10
 H          10    11
 H          11    12
 H          13    13
 H          14    14
 H          15    15
 H          16    16
 H          17    17
 H          18    18
 H          19    19
 H          20    20
- crude pre-optimization
Optimizing all 10 structures from file "deprotonate_0.xyz" ...
 1 2 3 4 5 6 7 8 9 10
done.
 9 structures remain within 90.00 kcal/mol window
- loose optimization
Optimizing all 9 structures from file "deprotonate_1.xyz" ...
 1 2 3 4 5 6 7 8 9
done.
 7 structures remain within 60.00 kcal/mol window
- optimization with user-defined thresholds
Optimizing all 7 structures from file "deprotonate_2.xyz" ...
 1 2 3 4 5 6 7
done.
 5 structures remain within 30.00 kcal/mol window
===================================================
Identifying topologically equivalent structures:
 Equivalent to 2. structure: 2 structure(s).
 Equivalent to 4. structure: 2 structure(s).
Done.
Appending file 'deprotonated.xyz' with structures.
Initial 5 structures from file deprotonate_3.xyz have been reduced to 3 topologically unique structures.
===================================================
============= ordered structure list ==============
written to file 'deprotonated.xyz'
 structure   ΔE(kcal/mol)   Etot(Eh)
    1            0.00      -33.597012
    2           24.18      -33.558475
    3           24.44      -33.558057
INPUT generation wall time : 0h : 0m : 0s
multilevel OPT wall time : 0h : 0m : 1s
Overall wall time : 0h : 0m : 1s
CREST terminated normally.
{% endcapture %} {% include codecell.html content=output_file %}
The deprotonation yields 3 possible structures of Ala-Gly- at the GFN2-xTB level in the gas phase.
Only one of these structures is favorable; the other two isomers have relative energies of about 24 kcal/mol.
These structures are written to the `deprotonated.xyz` ensemble file.
Note that, as in the protonation procedure, the order of the hydrogen atoms changes between the structures, so the file does not formally match the ensemble file format.
{: .text-justify }
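To put these energy gaps into perspective, the relative energies can be converted into Boltzmann weights. The Python sketch below does this for the three ΔE values of the deprotonation run at an assumed temperature of 298.15 K. It treats the GFN2-xTB electronic energies as if they were free energies, which is a simplification made here for illustration only.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)
TEMPERATURE = 298.15   # K, assumed for this illustration

# Relative energies (kcal/mol) from the ordered structure list of the run.
delta_e = [0.00, 24.18, 24.44]

weights = [math.exp(-de / (R_KCAL * TEMPERATURE)) for de in delta_e]
total = sum(weights)
populations = [w / total for w in weights]

for rank, p in enumerate(populations, start=1):
    print(f"structure {rank}: population {p:.3e}")
```

Under these assumptions the lowest-energy structure carries essentially the entire population, while the two isomers near 24 kcal/mol are negligible at room temperature.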
The third application belonging to the protonation/deprotonation procedures described in this example is the screening for prototropic tautomers. This is done by running the protonation and deprotonation site screenings consecutively for a molecule. In the following, it is demonstrated for the alanineglycine molecule from above {{site.data.icons.aup}}.
{: .text-justify }
Assuming the input coordinates are given as `struc.xyz`, the screening procedure with the default GFN2-xTB settings can be initiated as before.
However, since zwitterions are expected among the possible tautomers of Ala-Gly, ALPB implicit solvation (for water) is also used in this example.
{: .text-justify }
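When the three screening modes are driven from a script, the command line can be assembled programmatically. The Python sketch below uses only the flags that appear in this example (`--protonate`, `--deprotonate`, `--tautomerize`, `--alpb`). The helper `crest_command` is hypothetical, and the commented-out call assumes that a `crest` executable is available on the `PATH`.

```python
import subprocess  # only needed for the optional call at the bottom

MODES = ("protonate", "deprotonate", "tautomerize")


def crest_command(xyz_file, mode, solvent=None):
    """Build the argument list for one of the screening modes."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["crest", xyz_file, f"--{mode}"]
    if solvent is not None:
        cmd += ["--alpb", solvent]  # ALPB implicit solvation
    return cmd


cmd = crest_command("struc.xyz", "tautomerize", solvent="water")
print(" ".join(cmd))  # crest struc.xyz --tautomerize --alpb water

# Uncomment to actually run CREST (requires the crest executable):
# subprocess.run(cmd, check=True)
```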
command
{{ site.data.icons.codefile }} struc.xyz
{{ site.data.icons.checkfile }} output
C 2.081440 0.615100 -0.508430
C 2.742230 1.824030 -1.200820
N 4.117790 1.799870 -1.190410
C 4.943570 2.827040 -1.822060
C 6.440080 2.569360 -1.637600
O 7.351600 3.252270 -2.069090
N 0.610100 0.695090 -0.538780
O 2.095560 2.724940 -1.739670
O 6.705220 1.463410 -0.897460
H 0.303080 1.426060 0.103770
H 0.338420 1.050680 -1.460480
C 2.488753 -0.593400 -1.198448
H 2.416500 0.557400 0.532050
H 4.614100 1.081980 -0.670550
H 4.699850 3.794460 -1.373720
H 4.722890 2.844690 -2.894180
H 7.687400 1.448620 -0.860340
H 2.029201 -1.457008 -0.719999
H 2.170233 -0.542411 -2.238576
H 3.572730 -0.688405 -1.154998
{% endcapture %}
{% include codecell.html content=struc_xyz %}
==============================================
| |
| C R E S T |
| |
| Conformer-Rotamer Ensemble Sampling Tool |
| based on the GFN methods |
| P.Pracht, S.Grimme |
| Universitaet Bonn, MCTC |
==============================================
Version 2.12, Thu 19. Mai 16:32:32 CEST 2022
Using the xTB program. Compatible with xTB version 6.4.0
Cite work conducted with this code as
• P.Pracht, F.Bohle, S.Grimme, PCCP, 2020, 22, 7169-7192.
• S.Grimme, JCTC, 2019, 15, 2847-2862.
and for works involving QCG as
• S.Spicher, C.Plett, P.Pracht, A.Hansen, S.Grimme, JCTC, 2022, 18 (5), 3174-3189.
with help from: C.Bannwarth, F.Bohle, S.Ehlert, S.Grimme, C.Plett, P.Pracht, S.Spicher
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Command line input:
crest struc.xyz --tautomerize --alpb water
--tautomerize : automated tautomerization script
--alpb water : implicit solvation
 __________________________________________
|                                          |
|     automated tautomerization script     |
|__________________________________________|
 Universitaet Bonn, MCTC
 P.Pracht, Wed 28. Nov 13:11:52 CEST 2018
Cite as: P.Pracht, R.Wilcken, A.Udvarhelyi, S.Rodde, S.Grimme JCAMD, 2018, 32, 1139-1149.
Input coordinate lines sorted:
 element   old   new
 C           1     1
 C           2     2
 N           3     3
 C           4     4
 C           5     5
 O           6     6
 N           7     7
 O           8     8
 O           9     9
 C          12    10
 H          10    11
 H          11    12
 H          13    13
 H          14    14
 H          15    15
 H          16    16
 H          17    17
 H          18    18
 H          19    19
 H          20    20
** P R O T O N A T I O N C Y C L E 1 of 2 **
[....]
** D E P R O T O N A T I O N C Y C L E 1 of 2 **
[....]
** P R O T O N A T I O N C Y C L E 2 of 2 **
Calculating LMOs for all structures in file 'tautomerize_1.xyz'
 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Collecting generated protomers ... done.
- crude pre-optimization
Optimizing all 178 structures from file "protomers.xyz" ...
done.
Structures sorted out due to dissociation: 12
 160 structures remain within 60.00 kcal/mol window
- loose optimization
Optimizing all 160 structures from file "protonate_0.xyz" ...
done.
Structures sorted out due to dissociation: 1
 99 structures remain within 30.00 kcal/mol window
===================================================
Identifying topologically equivalent structures:
 Equivalent to 1. structure: 5 structure(s).
 Equivalent to 2. structure: 14 structure(s).
[....]
Done.
Appending file 'protonated.xyz' with structures.
Initial 99 structures from file protonate_1.xyz have been reduced to 25 topologically unique structures.
written to file 'protonated.xyz'
 structure   ΔE(kcal/mol)   Etot(Eh)
    1            0.00      -34.071561
    2            0.19      -34.071253
    3            0.85      -34.070211
[....]
   24           28.60      -34.025982
   25           29.34      -34.024801
** D E P R O T O N A T I O N C Y C L E 2 of 2 **
- crude pre-optimization
Optimizing all 275 structures from file "deprotonate_0.xyz" ...
done.
Structures sorted out due to dissociation: 19
 217 structures remain within 60.00 kcal/mol window
- loose optimization
Optimizing all 217 structures from file "deprotonate_1.xyz" ...
done.
 167 structures remain within 30.00 kcal/mol window
===================================================
Identifying topologically equivalent structures:
 Equivalent to 1. structure: 5 structure(s).
 Equivalent to 3. structure: 6 structure(s).
[...]
 Equivalent to 158. structure: 3 structure(s).
Done.
Appending file 'deprotonated.xyz' with structures.
Initial 167 structures from file deprotonate_2.xyz have been reduced to 69 topologically unique structures.
written to file 'deprotonated.xyz'
 structure   ΔE(kcal/mol)   Etot(Eh)
    1            0.00      -33.883352
    2            0.16      -33.883094
    3            0.21      -33.883010
    4            0.36      -33.882783
    5            1.12      -33.881568
[....]
   68           29.46      -33.836411
   69           29.79      -33.835885
** T A U T O M E R I Z E **
Optimizing all 69 structures from file "tautomerize_3.xyz" ...
done.
 68 structures remain within 30.00 kcal/mol window
===================================================
Identifying topologically equivalent structures:
 Equivalent to 1. structure: 2 structure(s).
 Equivalent to 3. structure: 2 structure(s).
Done.
Appending file 'tautomers.xyz' with structures.
Initial 68 structures from file tautomerize_4.xyz have been reduced to 66 topologically unique structures.
===================================================
============= ordered structure list ==============
written to file 'tautomers.xyz'
 structure   ΔE(kcal/mol)   Etot(Eh)
    1            0.00      -33.883749
    2            0.00      -33.883749
    3            0.21      -33.883408
    4            0.23      -33.883384
    5            0.52      -33.882926
[....]
   65           29.56      -33.836645
   66           29.61      -33.836570
LMO calc. wall time : 0h : 0m : 0s
multilevel OPT wall time : 0h : 0m :31s
Overall wall time : 0h : 0m :31s
CREST terminated normally.
{% endcapture %} {% include codecell.html content=output_file %}
In general, a much larger variety of structures is obtained than with the standalone protonation or deprotonation procedure.
For Ala-Gly, as expected, the zwitterion is the most stable tautomer in implicit solvation.
Other tautomers, such as the (S)-isomer of the (R)-configured input structure, are also generated.
However, because many chemical changes can occur during the protonation/deprotonation sequence, many artifacts of the respective level of theory are created as well.
When evaluating the output file (`tautomers.xyz`), users must rely on their chemical intuition to decide whether a generated tautomer merits further investigation.
{: .text-justify }
{% include image.html file="example-5-3.png" alt="Ala-Gly tautomers" caption="Some tautomers of the alanineglycine molecule (GFN2-xTB, ALPB water)." max-width=600 %}
Note that the protonation/deprotonation sequence is performed twice for the tautomer screening. This is because a single sequence swaps one proton position relative to the input structure. Swapping more than one proton position sequentially can therefore be achieved by executing the protonation/deprotonation sequence multiple times.
{: .text-justify }
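Why two cycles reach more tautomers than one can be seen in a toy model. The Python sketch below assumes that a molecule is described only by the set of sites carrying a proton, that every site holds at most one proton, and that one cycle adds a proton to a free site and then removes any one proton. Energies, energy windows, and topology checks are ignored, and the function names are hypothetical.

```python
def one_cycle(state, n_sites):
    """States reachable by one protonation followed by one deprotonation."""
    reachable = set()
    for site in range(n_sites):
        if site in state:
            continue  # toy assumption: a site holds at most one proton
        protonated = state | {site}
        for removed in protonated:
            reachable.add(protonated - {removed})
    return reachable


def n_cycles(state, n_sites, n):
    """States reachable after n consecutive cycles."""
    current = {frozenset(state)}
    for _ in range(n):
        current = {new for s in current for new in one_cycle(s, n_sites)}
    return current


start = frozenset({0, 1})          # protons on sites 0 and 1 of 4 sites
after_one = n_cycles(start, 4, 1)
after_two = n_cycles(start, 4, 2)
print(len(after_one), len(after_two))  # 5 6
```

In this model the arrangement with both protons moved, `{2, 3}`, is missing after one cycle and only appears after the second one.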