---
layout: default
title: Ensemble Sorting
parent: Examples and Guides
nav_order: 2
toc: false
summary: A guide to sorting ensembles.
permalink: /page/examples/example_2.html
---

{{page.title}}

{: .no_toc }

{{ page.summary }} {: .fs-6 .fw-300 }

Table of contents

{: .no_toc .text-delta }

  1. TOC {:toc}

Sorting Conformational Ensembles with CREGEN

{: .em-6 }

One of the core procedures of CREST, the sorting of ensembles based on energy, rotational constants, and Cartesian RMSDs, is referred to as CREGEN and can be invoked as a standalone procedure. CREST can sort any ensemble file that satisfies the input specifications given in the Input Formats section. {: .text-justify }

Assume in the following that you are working with the ensemble for alanine-glycine (Ala-Gly) from Example 1. As you may remember, the ensemble in the previous example was generated with an implicit solvation potential for water. Assume that this was an error and you wanted the ensemble in the gas phase. After refining (optimizing) the GBSA(water) structures in the gas phase and stitching them together in a new ensemble file (here: input-ensemble.xyz), or by using the --mdopt function (Example: Ensemble Optimization), you might be left with an unordered collection of structures as in the figure below. {: .text-justify }

{% include image.html file="example-2-1.png" alt="Ala-Gly MOLDEN screenshot" caption="MOLDEN screenshot of the unsorted Ala-Gly ensemble." max-width=500 %}

You can see that this ensemble contains 20 structures in total, but they are not in ascending order of relative energy. Furthermore, some of the structures seem to be degenerate. The question now is: how do we sort this ensemble? {: .text-justify }

The answer is given here:

{{ site.data.icons.code }} command {{ site.data.icons.codefile }} struc.xyz {{ site.data.icons.codefile }} input-ensemble.xyz {{ site.data.icons.checkfile }} output
{% include command.html cmd="crest struc.xyz --cregen input-ensemble.xyz" %} This is the command that needs to be executed from the command line. The `--cregen` option activates the ensemble sorting, and the ensemble file (`input-ensemble.xyz`) is passed as its argument. A CREST input structure (`struc.xyz`) has to be provided to obtain a reference topology. The output will look something like the one in the `output` tab above. Note again that the ensemble must satisfy the format requirements from the [**Input Formats** section]({{site.baseurl}}/page/documentation/coords.html#ensemble-and-trajectory-files).
{% capture struc_xyz %}
20

C 2.081440 0.615100 -0.508430
C 2.742230 1.824030 -1.200820
N 4.117790 1.799870 -1.190410
C 4.943570 2.827040 -1.822060
C 6.440080 2.569360 -1.637600
O 7.351600 3.252270 -2.069090
N 0.610100 0.695090 -0.538780
O 2.095560 2.724940 -1.739670
O 6.705220 1.463410 -0.897460
H 0.303080 1.426060 0.103770
H 0.338420 1.050680 -1.460480
C 2.488753 -0.593400 -1.198448
H 2.416500 0.557400 0.532050
H 4.614100 1.081980 -0.670550
H 4.699850 3.794460 -1.373720
H 4.722890 2.844690 -2.894180
H 7.687400 1.448620 -0.860340
H 2.029201 -1.457008 -0.719999
H 2.170233 -0.542411 -2.238576
H 3.572730 -0.688405 -1.154998
{% endcapture %}
{% include codecell.html content=struc_xyz %}

{% capture struc_xyz %}
20
-33.86165671
C -2.2730610770 0.1204379121 0.2587326844
C -0.9540286313 -0.6593849471 0.3188226177
N 0.0417612113 0.0319398989 0.9170922194
C 1.4013427195 -0.4263317141 0.8768297077
C 2.2285158127 0.3824892208 -0.1040932435
O 1.8101914129 1.2894738222 -0.7698878586
N -1.9934073645 1.5510154031 0.3009220912
O -0.8328196279 -1.7982550975 -0.0790922210
O 3.4987650157 -0.0270770628 -0.1355606108
H -2.8592506267 2.0736885995 0.3771164510
H -1.5415781078 1.8360727560 -0.5640820639
C -3.0849322979 -0.3334409426 -0.9572372265
H -2.8248558387 -0.1306727078 1.1744464470
H -0.1400457636 1.0131089865 1.0870843862
H 1.3864410357 -1.4715483602 0.5478117885
H 1.8718080464 -0.3766800189 1.8640823887
H 4.0113840380 0.4921882195 -0.7758288770
H -4.0812094663 0.1042256481 -0.9410289372
H -2.5803076349 -0.0321120913 -1.8738937405
H -3.1688710126 -1.4161436261 -0.9539833566
20
-33.86102181
C -2.2533712145 0.0972477920 0.3524677726
C -0.9486533068 -0.6831810101 0.1887178128
N -0.0851255525 -0.0866258698 -0.6634183403
C 1.2696157893 -0.5416011649 -0.7960175233
C 2.2378669776 0.3785566551 -0.0773074674
O 1.9247543567 1.3675749054 0.5264560749
N -1.9881533783 1.5222287820 0.1912959402
O -0.7436074016 -1.7541716850 0.7188043488
O 3.5004992684 -0.0361848156 -0.2043350417
H -1.3987384767 1.8606745178 0.9454182962
H -2.8567266980 2.0453081606 0.2165326785
C -3.2316635691 -0.3676154711 -0.7276857377
H -2.6650334311 -0.1665201692 1.3412609905
H -0.3050817080 0.8709799044 -0.9086463393
H 1.5660428872 -0.6175510358 -1.8471833127
H 1.3280665711 -1.5373998577 -0.3424201058
H 4.1057716458 0.5554795202 0.2709981483
H -2.8489560580 -0.0997193745 -1.7091613963
H -3.3553510338 -1.4455631125 -0.6742119592
H -4.2017435483 0.1050733020 -0.5880694434
20
-33.86073072
C -2.2067909289 0.0717692945 0.4076167499
C -0.9526836976 -0.5155650227 -0.2501224158
N 0.0277895128 0.3923767998 -0.4432617267
C 1.3440112146 0.0032279766 -0.8655046549
C 2.3877864002 0.3434810611 0.1780817359
O 2.1732715638 0.9329475567 1.2008189437
N -2.2132477034 1.5223171609 0.2676380848
O -0.8570311732 -1.6814412507 -0.5715466899
O 3.6023270301 -0.0756311211 -0.1872298459
H -2.4183575923 1.7722272947 -0.6954735232
H -2.9415471957 1.9249734219 0.8471352649
C -3.4568721994 -0.6263466666 -0.1351424921
H -2.1091814034 -0.1507408872 1.4785502786
H -0.1005413276 1.3011271369 -0.0176180824
H 1.6255585526 0.4921305598 -1.8060633670
H 1.3329241149 -1.0797410588 -1.0315998765
H 4.2599605095 0.1480846436 0.4908652079
H -3.6095086966 -0.3587205358 -1.1796993938
H -3.3249846500 -1.7025124673 -0.0740298234
H -4.3378013561 -0.3396067322 0.4358328351
20
-33.86057141
C -2.3310261451 -0.0168879925 0.1487092548
C -1.0032289009 -0.7096756165 0.4542781253
N -0.0156190233 0.1574251065 0.7616418870
C 1.3576532049 -0.2509400325 0.8393976519
C 2.1808568212 0.3664433692 -0.2737397020
O 1.7590619760 1.1301045702 -1.0977567762
N -2.4068803868 1.2510050495 0.8647220741
O -0.8564774692 -1.9130145373 0.4141496371
O 3.4553387851 -0.0284699984 -0.2218958307
H -3.2166479815 1.7773607272 0.5546580200
H -2.5043631239 1.0895940516 1.8615834153
C -2.3650633990 0.2441733926 -1.3587350015
H -3.1404002061 -0.7195532207 0.4086029730
H -0.2319824217 1.1408662813 0.6686037017
H 1.3827325016 -1.3421301305 0.7434645268
H 1.8081822649 0.0228428526 1.8002815769
H 3.9675701457 0.3699456807 -0.9438759677
H -2.2386981664 -0.6905317017 -1.8979857770
H -3.3162129924 0.6890721761 -1.6444774534
H -1.5618283500 0.9239372693 -1.6330965926
20
-33.86165665
C -2.3079792196 0.0276093894 0.3125061794
C -0.9502972263 -0.6802559173 0.3978470086
N 0.0087414174 0.0834863485 0.9674855940
C 1.3903084837 -0.3046119936 0.9400407980
C 2.1726646110 0.5078826420 -0.0740674670
O 1.7067642768 1.3662400808 -0.7719694447
N -2.1024703831 1.4712721672 0.2997355703
O -0.7706189339 -1.8253696169 0.0422297724
O 3.4622932636 0.1633468017 -0.0952277899
H -1.6668989872 1.7462395166 -0.5767960189
H -2.9940268334 1.9511809221 0.3588026497
C -3.0967447083 -0.5132596492 -0.8828883514
H -2.8449441061 -0.2167201468 1.2388121859
H -0.2236635476 1.0596571374 1.1007720908
H 1.4295665214 -1.3610626781 0.6514245655
H 1.8589733976 -0.1924673074 1.9230186055
H 3.9459451489 0.6837491453 -0.7567745546
H -4.1144946720 -0.1278435476 -0.8796155558
H -2.6098989288 -0.2206120269 -1.8118935927
H -3.1240005399 -1.5979841496 -0.8388752628
20
-33.86000997
C -2.2637553557 0.1202735224 0.2811079939
C -0.9455539764 -0.6646841755 0.2713655914
N 0.0721424598 0.0084524330 0.8482925225
C 1.4159917504 -0.4933500319 0.8428840595
C 2.3451345979 0.2571932689 -0.0924136261
O 3.5356002388 0.1127780210 -0.1286826662
N -1.9797763914 1.5502881525 0.3417729298
O -0.8434603830 -1.7917340496 -0.1634908305
O 1.6922587743 1.1140335823 -0.8851822612
H -2.8388072118 2.0692197476 0.4892263932
H -1.5905974094 1.8556560488 -0.5462029679
C -3.1304010296 -0.3055407848 -0.9061301374
H -2.7747595249 -0.1496377473 1.2148249847
H -0.1065114271 0.9720374789 1.0962458966
H 1.3655403015 -1.5346577249 0.4994402364
H 1.8550253134 -0.4835997278 1.8460194649
H 2.3144875970 1.5540538999 -1.4857289397
H -2.6674129922 0.0139640300 -1.8385453040
H -3.2161259077 -1.3879386863 -0.9219716020
H -4.1241759014 0.1324525673 -0.8356817494
20
-33.86073012
C -2.2296316197 -0.2607678790 0.0657109965
C -0.9147910093 0.0602702906 -0.6543210201
N 0.0230524813 0.5981249095 0.1546311320
C 1.3779827681 0.7860594612 -0.2824259352
C 2.3470549590 -0.0467135950 0.5311859687
O 2.0492141181 -0.7246933803 1.4751611795
N -2.3088514278 0.4802675735 1.3180381284
O -0.7402136520 -0.1430250193 -1.8374805840
O 3.5991691874 0.0793951849 0.0833257674
H -3.0856902397 0.1413005109 1.8748220487
H -2.4798978645 1.4627226542 1.1240551219
C -3.4118084663 -0.0389859422 -0.8818411000
H -2.1718848250 -1.3269522696 0.3223700985
H -0.1752977545 0.5982290672 1.1466581065
H 1.6817819336 1.8373513282 -0.2095646826
H 1.4321761700 0.4850168552 -1.3345509177
H 4.2079979114 -0.4604617222 0.6126987411
H -4.3338981311 -0.4153618316 -0.4432372321
H -3.5273879599 1.0234158006 -1.0915579799
H -3.2248425909 -0.5538728350 -1.8195039397
20
-33.86073104
C -2.2338781151 -0.2590779091 0.0204582920
C -0.9104442307 0.1129982376 -0.6580795482
N 0.0218369146 0.5772197482 0.2011687680
C 1.3806938710 0.7986475573 -0.2070216745
C 2.3434805451 -0.0887756527 0.5542858960
O 2.0376760303 -0.8353514986 1.4423374986
N -2.3210381739 0.3745408364 1.3299079701
O -0.7252846567 0.0073919264 -1.8523562662
O 3.6001473907 0.0743445881 0.1318193103
H -2.4842206932 1.3709833937 1.2176302766
H -3.1054748884 -0.0046163891 1.8488180192
C -3.4054101192 0.0499987881 -0.9158377478
H -2.1851935375 -1.3434370322 0.1869045839
H -0.1854754428 0.4968967436 1.1880936978
H 1.6812650520 1.8424841835 -0.0549372944
H 1.4464744989 0.5753487800 -1.2777277738
H 4.2052269209 -0.5006810498 0.6273842253
H -3.5105367021 1.1269924361 -1.0385757180
H -3.2139845544 -0.3877863898 -1.8909827792
H -4.3344913978 -0.3530156558 -0.5175190044
20
-33.85930301
C -2.3054406994 -0.0367685437 0.1244703160
C -0.9679739730 -0.7092899667 0.4375785069
N 0.0069971431 0.1737492914 0.7341885286
C 1.3765721332 -0.2209957659 0.8905430967
C 2.2804211715 0.2617113815 -0.2286184296
O 3.4762594651 0.1644084728 -0.2253932805
N -2.3962387688 1.2396265280 0.8254985470
O -0.8055603407 -1.9105188386 0.4042116757
O 1.5978426529 0.8214430390 -1.2323046179
H -2.4949325929 1.0869587311 1.8238656747
H -3.2137327858 1.7508888699 0.5098811336
C -2.3504306695 0.2040910244 -1.3856206956
H -3.1042837947 -0.7462881033 0.3964702877
H -0.2366662253 1.1530219266 0.6856470504
H 1.3938808891 -1.3183056386 0.8868817746
H 1.7941820982 0.1247463331 1.8424867969
H 2.2027922283 1.0919039359 -1.9409991997
H -3.3115562392 0.6251126223 -1.6741307573
H -1.5621828916 0.8966184623 -1.6708748868
H -2.2060138046 -0.7348844061 -1.9125018252
20
-33.85942329
C -2.2027818062 -0.0113484559 0.5737099977
C -0.9319242213 -0.7048879235 0.0793252674
N -0.1147064447 0.1261537826 -0.6013548102
C 1.1828715273 -0.2857161681 -1.0530557228
C 2.3273331889 0.2876443908 -0.2382534438
O 3.4848641910 0.2102872865 -0.5438207320
N -1.9137867200 1.3913510379 0.8547194102
O -0.7181300393 -1.8881029462 0.2343253223
O 1.9075110436 0.9008960103 0.8735413004
H -1.2976651491 1.4686494188 1.6576534800
H -2.7717185924 1.8864058780 1.0738213725
C -3.2590491131 -0.1075548015 -0.5281336353
H -2.5569969374 -0.5723955075 1.4543846628
H -0.3444214641 1.1100550878 -0.5593124060
H 1.3442839033 -0.0270472875 -2.1046767824
H 1.2212203747 -1.3784055633 -0.9568958882
H 2.6661909197 1.2279368336 1.3822738489
H -3.4043514972 -1.1466798425 -0.8095472482
H -4.2071214875 0.2978271949 -0.1802775640
H -2.9352313879 0.4562839232 -1.3993400602
20
-33.85965062
C -2.1169056167 -0.0676036070 0.6541719477
C -0.9430817542 -0.4472829984 -0.2571269403
N -0.0044977234 0.5178349895 -0.3491409109
C 1.2391276027 0.3061447930 -1.0327837044
C 2.4411227497 0.2303869979 -0.1102538529
O 3.5816607003 0.2127130268 -0.4829102066
N -2.1637201294 1.3792053800 0.8337926386
O -0.8705197429 -1.5103678970 -0.8360479967
O 2.0945269265 0.1903087180 1.1791964535
H -2.4928468754 1.8172751981 -0.0220919064
H -2.8246531736 1.6162907979 1.5658032364
C -3.4122109369 -0.6841985646 0.1202538030
H -1.8866286410 -0.5089383623 1.6329580868
H -0.1243676063 1.3272985997 0.2436791261
H 1.4282076499 1.0854220871 -1.7791795616
H 1.1551273383 -0.6558211705 -1.5545814394
H 2.8833928910 0.1144270621 1.7389115396
H -3.6925101277 -0.2085317762 -0.8184706212
H -3.2580374964 -1.7427948352 -0.0664300447
H -4.2228851128 -0.5601123084 0.8356336213
20
-33.85759111
C -2.1901233671 -0.0484028369 0.2033815383
C -0.6786959341 -0.2915451272 0.1389793351
N 0.0175133452 0.3612124901 1.0839553907
C 1.4421517152 0.1762950862 1.2203858584
C 2.0766384514 -0.1244736145 -0.1362358139
O 2.8202500599 -1.0385066525 -0.3499119811
N -2.4625451423 1.1695747192 0.9587981629
O -0.1532336345 -1.0136498895 -0.6843995221
O 1.8251874489 0.8494193732 -1.0184023469
H -3.4566058418 1.2459954478 1.1461356972
H -2.1987464758 1.9783206693 0.4025432865
C -2.7777198539 -0.0660111618 -1.2097995602
H -2.6067302443 -0.8877608247 0.7757673138
H -0.4975488760 0.9199760459 1.7488464928
H 1.6775807638 -0.6567895253 1.8930927017
H 1.8893107982 1.0963905828 1.6049016073
H 2.1915288823 0.6083193502 -1.8828728155
H -2.4190535165 0.7943742254 -1.7728174620
H -2.4575106514 -0.9678870894 -1.7229893177
H -3.8651605654 -0.0392229232 -1.1757929951
20
-33.85761730
C -2.1996812499 -0.2390528863 0.1312229691
C -0.6936574457 -0.4357021692 0.2853513785
N -0.0748014042 0.5940746208 0.8917658415
C 1.3637684319 0.6016727344 0.9989685670
C 1.9600586107 0.1484937942 -0.3330384230
O 1.6513165827 0.6010550030 -1.3999194247
N -2.6951755503 0.6337617166 1.1910489618
O -0.0990888457 -1.4117585797 -0.1246458897
O 2.9137220564 -0.7629133982 -0.1609008349
H -3.6585456064 0.8913986821 1.0040040507
H -2.6735511884 0.1516076273 2.0838051741
C -2.4360572498 0.4169278018 -1.2301409309
H -2.6654261368 -1.2385375527 0.1307330454
H -0.6358746852 1.3846237041 1.1699891659
H 1.7083841657 -0.0830405648 1.7811783190
H 1.7080044083 1.6144118624 1.2129594996
H 3.2386261254 -1.0797051899 -1.0184575777
H -1.9709237432 -0.1784299977 -2.0104964607
H -3.5022760373 0.4968013689 -1.4332992810
H -1.9986664999 1.4119853560 -1.2407468110
20
-33.86057136
C -2.3652699709 -0.1186682320 0.2537183779
C -0.9913401408 -0.7134161718 0.5615837466
N -0.0552632917 0.2262223487 0.8122386546
C 1.3422805700 -0.0921331311 0.8770471118
C 2.1014296959 0.5154274165 -0.2860676166
O 1.6147912742 1.2053099081 -1.1388894181
N -2.5099471531 1.1700574655 0.9202901342
O -0.7694562148 -1.9058073303 0.5695516824
O 3.3994703992 0.2050715202 -0.2436576090
H -3.3569138145 1.6310000423 0.6054741070
H -2.5800404275 1.0434919138 1.9244599079
C -2.4400914322 0.0783587155 -1.2619616203
H -3.1240555405 -0.8596536062 0.5563233396
H -0.3349650307 1.1891663221 0.6817113005
H 1.4335448907 -1.1832374378 0.8354442392
H 1.7948972830 0.2583721928 1.8115767326
H 3.8702500053 0.5951631584 -0.9976812442
H -2.2638527730 -0.8677904092 -1.7661145948
H -3.4218847478 0.4511662127 -1.5472152203
H -1.6857749723 0.7950179608 -1.5781215579
20
-33.86021928
C -2.2498703946 -0.2752003246 -0.1493621246
C -0.9413619798 -0.1875375828 0.6454356397
N 0.1026793743 -0.8488868153 0.0840967375
C 1.4477253875 -0.6662361619 0.5649756566
C 2.3357205478 -0.1089131537 -0.5276355009
O 1.9863126051 0.0586617689 -1.6648347214
N -3.4116279163 0.0196319865 0.6673446983
O -0.8535130992 0.4032236051 1.7015326761
O 3.5653059054 0.1537626421 -0.0876228092
H -3.4948659628 -0.6624728105 1.4143570226
H -3.2779757558 0.9237779751 1.1122056725
C -2.1760850955 0.7277534790 -1.3049471232
H -2.3633065849 -1.2903713163 -0.5478976153
H 0.0175538944 -1.1794091324 -0.8659372894
H 1.8797849071 -1.6110388164 0.9163791372
H 1.4120973532 0.0288729768 1.4103320842
H 4.1201040745 0.5065936306 -0.8020973853
H -2.0396626186 1.7324928385 -0.9109442810
H -1.3488299666 0.4991086631 -1.9721196137
H -3.1072474570 0.6994284428 -1.8635791510
20
-33.86021849
C -2.2520341742 -0.2833996477 -0.1193949827
C -0.9367744560 -0.1520411175 0.6580082578
N 0.0985560747 -0.8570484519 0.1346561004
C 1.4483353900 -0.6509539107 0.5923055743
C 2.3283860594 -0.1603797181 -0.5381628317
O 1.9699036173 -0.0583259706 -1.6802701315
N -3.4050403026 0.0760327048 0.6837309207
O -0.8369145213 0.5071984003 1.6716396074
O 3.5621572160 0.1242731114 -0.1242140857
H -3.2612986989 1.0076066630 1.0641707334
H -3.4871625446 -0.5522849445 1.4766240918
C -2.1801799929 0.6371343992 -1.3417513822
H -2.3761401991 -1.3226531433 -0.4462734791
H 0.0040226776 -1.2477629483 -0.7913193868
H 1.8808031495 -1.5751066433 0.9943328916
H 1.4216248785 0.0924550492 1.3958684288
H 4.1116194963 0.4338033648 -0.8624421477
H -2.0334814450 1.6655273521 -1.0188928033
H -1.3598951235 0.3569423595 -1.9977727300
H -3.1159159361 0.5775768771 -1.8901876354
20
-33.85745642
C -2.0988939428 -0.3515316703 0.3956355473
C -0.6525021938 -0.0681851959 -0.0250169087
N 0.0273607375 0.6810559930 0.8576667432
C 1.3924965286 1.0795352388 0.6113595503
C 2.1051718387 0.0509305569 -0.2660187085
O 2.6933981539 0.3137618785 -1.2758180775
N -2.5531835473 0.6868312011 1.3152914904
O -0.1509506445 -0.5112485699 -1.0386222266
O 2.1306918400 -1.1512380053 0.3184101027
H -3.4364368602 0.4166015357 1.7346641733
H -2.7174785359 1.5476388311 0.8003316400
C -2.9770960088 -0.5331884563 -0.8440530876
H -2.0690674730 -1.2941385086 0.9578518687
H -0.4732617781 1.0300777126 1.6624708963
H 1.9269438191 1.1375842041 1.5629900806
H 1.4421726503 2.0486199370 0.1007061846
H 2.5370353624 -1.7973258590 -0.2791038133
H -3.9705632606 -0.8795620540 -0.5657636314
H -3.0678403850 0.4110056236 -1.3789395716
H -2.5181332125 -1.2580092257 -1.5097669730
20
-33.85697346
C -2.1912988995 -0.1920605749 0.0425876710
C -0.6753278881 -0.3497323028 0.1580770453
N -0.0725303394 0.7049824376 0.7299762529
C 1.3521541070 0.7253197811 0.9560797243
C 2.0748742052 -0.0979396174 -0.1088631629
O 2.8750334655 -0.9574072859 0.1268227471
N -2.6818225172 0.6282644489 1.1460914484
O -0.0694265717 -1.3097072931 -0.2735009712
O 1.8213606593 0.3654025886 -1.3368797823
H -3.6552425253 0.8696208055 0.9923565985
H -2.6284000106 0.1125551677 2.0186947548
C -2.4886891704 0.5012384401 -1.2876449852
H -2.6250486555 -1.2052212256 0.0208558686
H -0.6642606015 1.4278890987 1.1143135459
H 1.6123248555 0.3092113035 1.9366971664
H 1.7119601266 1.7551951621 0.8871950359
H 2.2435105008 -0.2063927864 -1.9959931774
H -2.0354077841 -0.0559782124 -2.1025047373
H -3.5627308547 0.5565591461 -1.4541410682
H -2.0816445617 1.5091870931 -1.2774730984
20
-33.85973630
C -2.2035273954 0.4764356756 -0.0528000196
C -0.9601767950 -0.4156917332 -0.1698673386
N 0.0819571014 -0.0380065052 0.6109338411
C 1.3635567697 -0.6882702967 0.5192578239
C 2.4622635954 0.3244798665 0.2810936886
O 2.2998863676 1.5147739213 0.2841238852
N -3.1652841554 0.2119700129 -1.1055815296
O -0.9296080285 -1.4057355065 -0.8710872872
O 3.6457176298 -0.2576207261 0.0924030450
H -2.7505962819 0.4037851105 -2.0119394114
H -3.3986491524 -0.7772289940 -1.0968031719
C -2.8655367398 0.2359659486 1.3065896159
H -1.8973662553 1.5266892258 -0.1322119468
H 0.0818572995 0.8775496525 1.0356983279
H 1.3235906572 -1.4047330536 -0.3080203765
H 1.5988784486 -1.2429356884 1.4365622915
H 4.3385712160 0.4071605668 -0.0518143258
H -3.7787293440 0.8213854045 1.3655907293
H -3.1213460876 -0.8159215971 1.4119552049
H -2.2031173676 0.5199089365 2.1197314400
20
-33.85688151
C -2.1193213890 -0.4049908582 0.3584665200
C -0.6629928971 -0.2384092609 -0.0742660577
N -0.1120448442 0.9169005855 0.3296261794
C 1.2274406774 1.2878537714 -0.0575509173
C 2.0863812737 0.0451682677 -0.2868160092
O 2.7315352926 -0.1605345245 -1.2744689696
N -2.3565880832 0.3504418206 1.5842110922
O -0.0675989494 -1.0619487398 -0.7396660608
O 2.1551378656 -0.7084769766 0.8162330042
H -1.8737981326 -0.0848367221 2.3636891938
H -3.3477425846 0.3564688525 1.8009962525
C -3.0063249291 0.1399385168 -0.7622363826
H -2.3063500497 -1.4863433014 0.4642758731
H -0.6779877993 1.5345461436 0.8934950712
H 1.6872304970 1.8712064698 0.7442203032
H 1.2280302470 1.8719121412 -0.9854231640
H 2.6569873014 -1.5163427495 0.6287223931
H -4.0550223023 -0.0463788985 -0.5394243173
H -2.8554710517 1.2118457487 -0.8648269952
H -2.7551806194 -0.3467135179 -1.7004441213

{% endcapture %} {% include codecell.html content=struc_xyz %}

{% capture output_file %}
   ==============================================
   |                                            |
   |                 C R E S T                  |
   |                                            |
   |  Conformer-Rotamer Ensemble Sampling Tool  |
   |          based on the GFN methods          |
   |             P.Pracht, S.Grimme             |
   |          Universitaet Bonn, MCTC           |
   ==============================================
   Version 2.12,   Thu 19. Mai 16:32:32 CEST 2022

Using the xTB program. Compatible with xTB version 6.4.0

Cite work conducted with this code as

• P.Pracht, F.Bohle, S.Grimme, PCCP, 2020, 22, 7169-7192.
• S.Grimme, JCTC, 2019, 15, 2847-2862.

and for works involving QCG as

• S.Spicher, C.Plett, P.Pracht, A.Hansen, S.Grimme, JCTC, 2022, 18 (5), 3174-3189.

with help from: C.Bannwarth, F.Bohle, S.Ehlert, S.Grimme, C.Plett, P.Pracht, S.Spicher

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Command line input:

crest struc.xyz --cregen input-ensemble.xyz

--cregen : CREGEN standalone usage. Sorting file "input-ensemble.xyz"
Using only the cregen sorting routine.
input  file name : input-ensemble.xyz
output file name : input-ensemble.xyz.sorted
number of atoms                :   20
number of points on xyz files  :   20
RMSD threshold                 :   0.1250
Bconst threshold               :   0.0100
population threshold           :   0.0500
conformer energy window  /kcal :   6.0000
fragment in coord              :   1
bonds in reference structure   :   19
number of reliable points      :   20
reference state Etot : -33.8616567100000
running RMSDs... done.
number of doubles removed by rot/RMSD         :   2
total number unique points considered further :   18
      Erel/kcal        Etot weight/tot  conformer     set   degen     origin
   1   0.000   -33.86166    0.21411    0.42820       1       2
   2   0.000   -33.86166    0.21409
   3   0.398   -33.86102    0.10936    0.10936       2       1
   4   0.581   -33.86073    0.08039    0.16071       3       2
   5   0.581   -33.86073    0.08031
   6   0.681   -33.86057    0.06790    0.06790       4       1
   7   0.902   -33.86022    0.04678    0.09351       5       2
   8   0.902   -33.86022    0.04674
   9   1.033   -33.86001    0.03748    0.03748       6       1
  10   1.205   -33.85974    0.02806    0.02806       7       1
  11   1.259   -33.85965    0.02563    0.02563       8       1
  12   1.401   -33.85942    0.02015    0.02015       9       1
  13   1.477   -33.85930    0.01774    0.01774      10       1
  14   2.535   -33.85762    0.00298    0.00298      11       1
  15   2.551   -33.85759    0.00290    0.00290      12       1
  16   2.636   -33.85746    0.00251    0.00251      13       1
  17   2.939   -33.85697    0.00151    0.00151      14       1
  18   2.996   -33.85688    0.00137    0.00137      15       1
T /K                                  :   298.15
E lowest                              :   -33.86166
ensemble average energy (kcal)        :    0.457
ensemble entropy (J/mol K, cal/mol K) :   19.222    4.594
ensemble free energy (kcal/mol)       :   -1.370
population of lowest in %             :   42.820
number of unique conformers for further calc 15
list of relative energies saved as "crest.energies"


Wall Time Summary

          CREGEN wall time :         0h : 0m : 0s

Overall wall time : 0h : 0m : 0s

CREST terminated normally.

{% endcapture %} {% include codecell.html content=output_file %}

{% include defaulttab.html id="tab-id-1" %}

This program call produces three files of main interest:

  • <input-file-name>.xyz.sorted, the sorted input ensemble with an ascending energy order of structures and all duplicates removed.

  • crest_ensemble.xyz, the file containing all unique conformers (i.e., with rotamers removed).

  • crest.energies, a plaintext file containing the energies for all structures in crest_ensemble.xyz.
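As a quick sanity check on such files, the ascending energy order can be verified with a few lines of Python. The sketch below is not part of CREST. It assumes that every frame consists of an atom-count line, a comment line whose first token is the energy in Hartree (as in the input-ensemble.xyz tab above), and then the atom lines. The function names and the toy H2 frames with placeholder energies are hypothetical.

```python
from typing import List, Tuple

Frame = Tuple[float, List[str]]  # (energy from comment line, atom lines)


def parse_ensemble(text: str) -> List[Frame]:
    """Split a multi-structure XYZ string into (energy, atom_lines) frames.

    Assumption: the first token of each comment line is the energy.
    """
    lines = text.splitlines()
    frames: List[Frame] = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():  # skip blank lines between frames
            i += 1
            continue
        natoms = int(lines[i].split()[0])
        energy = float(lines[i + 1].split()[0])
        atoms = lines[i + 2 : i + 2 + natoms]
        if len(atoms) != natoms:
            raise ValueError("truncated frame in ensemble")
        frames.append((energy, atoms))
        i += 2 + natoms
    return frames


def is_sorted_by_energy(frames: List[Frame]) -> bool:
    """True if the frame energies are in ascending order."""
    energies = [energy for energy, _ in frames]
    return all(a <= b for a, b in zip(energies, energies[1:]))


# Toy ensemble with made-up energies, used only to exercise the functions.
DEMO = """2
-1.10
H 0.0 0.0 0.0
H 0.0 0.0 0.74
2
-1.17
H 0.0 0.0 0.0
H 0.0 0.0 0.75
2
-1.05
H 0.0 0.0 0.0
H 0.0 0.0 0.80
"""

frames = parse_ensemble(DEMO)
print(is_sorted_by_energy(frames))
print(is_sorted_by_energy(sorted(frames, key=lambda f: f[0])))
```

For a real file, the string would be read with `open("input-ensemble.xyz.sorted").read()`. Note that this sketch only reorders by energy. It does not remove duplicates or distinguish rotamers, which is what CREGEN does in addition.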

Taking a look at the output tab above reveals several things for our Ala-Gly example. First, a summary of the sorting thresholds is provided. The most important one is the energy window, which is set to 6 kcal/mol by default. All 20 structures of our ensemble lie within this window and are analyzed. Two of these structures are duplicates of structures already in the ensemble and are removed. Three further structures are identified as rotamers of conformers 1, 3, and 5, respectively. This means that, of the initial 20 structures, 18 conformers and rotamers were written to the input-ensemble.xyz.sorted file, and 15 unique conformers were written to crest_ensemble.xyz. The correct ascending energy order of the latter can be seen in the figure below.

{% include image.html file="example-2-2.png" alt="Ala-Gly MOLDEN screenshot" caption="MOLDEN screenshot of the sorted Ala-Gly ensemble." max-width=500 %}
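The weight column and the "population of lowest" entry in the output can be reproduced approximately from the relative energies alone. The following is a minimal sketch, assuming plain Boltzmann weights at the printed temperature of 298.15 K. The function name is hypothetical, and because the energies are taken from the rounded Erel/kcal column, small deviations from the printed values are expected.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872043  # gas constant in kcal/(mol K)


def boltzmann_weights(erel_kcal, temperature=298.15):
    """Return normalized Boltzmann weights for relative energies in kcal/mol."""
    kt = GAS_CONSTANT_KCAL * temperature
    factors = [math.exp(-e / kt) for e in erel_kcal]
    total = sum(factors)
    return [f / total for f in factors]


# Erel/kcal column copied from the output tab (18 unique structures).
erel = [0.000, 0.000, 0.398, 0.581, 0.581, 0.681, 0.902, 0.902, 1.033,
        1.205, 1.259, 1.401, 1.477, 2.535, 2.551, 2.636, 2.939, 2.996]

weights = boltzmann_weights(erel)

# Conformer 1 consists of structures 1 and 2 (degeneracy 2 in the output).
lowest_conformer_population = weights[0] + weights[1]

# Population-weighted average of the relative energies in kcal/mol.
average_energy = sum(w * e for w, e in zip(weights, erel))

print(f"weight of structure 1      : {weights[0]:.4f}")
print(f"population of conformer 1  : {100 * lowest_conformer_population:.2f} %")
print(f"ensemble average energy    : {average_energy:.3f} kcal/mol")
```

The results should land close to the printed 0.214, 42.8 %, and 0.457 kcal/mol. The rotamers of a conformer contribute to its total population, which is why the value in the conformer column is the sum over the degenerate entries.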

{% include tip.html content="Many options for adjusting sorting thresholds are available. See the Keyword Documentation for CREGEN." %}


Handling Topology in CREGEN

You might have noticed that Ala-Gly in the above example and in Example 1 was presented in its neutral state and not as a zwitterion. The zwitterionic species might have been generated during the MTD sampling, since GFNn-xTB, as a semiempirical quantum mechanical (SQM) method, allows bonds to form and break freely. However, the ensemble sorting procedure CREGEN by default checks the topology of all structures and removes those that do not match a given reference geometry (struc.xyz in our case). {: .text-justify }

You can try this by adding the following structure to the ensemble and executing the --cregen command again (see the example commands tab below).

{% include image.html file="example-2-3.png" alt="Ala-Gly zwitter ion" caption="Zwitterionic structure of Ala-Gly." max-width=300 %}

{{ site.data.icons.codefile }} zwitterion.xyz {{ site.data.icons.code }} example commands
{% capture struc_xyz %}
20
-33.818790361116
C -2.11918647262311 0.14895122905994 0.24492450423152
C -0.84613409018542 -0.71545902108420 0.09396449210643
N 0.01110878038306 -0.55798901361218 1.12345941613331
C 1.43207934792106 -0.57482696384029 0.88443865999916
C 1.88833437319196 0.82600233879956 0.32894440663697
O 0.97255546731960 1.64669100387907 0.07336052986943
N -1.68369973202269 1.50506796633395 -0.22286995339128
O -0.63218235961918 -1.39453299525113 -0.89224121288301
O 3.10015236848593 1.00313723593259 0.19263098642795
H -1.80652736708673 1.56915945577230 -1.24134242988971
H -0.62343017334745 1.63550171644745 -0.04221327393975
C -3.27633103833500 -0.36059678289799 -0.59454793916073
H -2.39948198144750 0.21252333754469 1.29833445377503
H -0.26102312521712 0.06347936661626 1.87388430943554
H 1.66922672788030 -1.32346211195377 0.12752055915879
H 1.98470863280314 -0.79708838903490 1.79613752154667
H -2.21430989711448 2.26244905004632 0.22131157546907
H -4.09390756675707 0.35408686586709 -0.60424410399088
H -2.94048851129317 -0.55788329271452 -1.60997845699847
H -3.63943941783663 -1.29716616081078 -0.18197510503603
{% endcapture %}
{% include codecell.html content=struc_xyz %}
{% include note.html content="The zwitterion is only stable at the GFN*n*-xTB level in combination with GBSA or ALPB. For consistency, the structure was optimized with `--gbsa water`, but its energy was replaced by a gas-phase single-point energy at the GFN2-xTB level (-33.818790361116 *E*h)." %}
To add the zwitterion to the ensemble, execute {% include command.html cmd="cat zwitterion.xyz >> input-ensemble.xyz" %} and then run CREGEN as before: {% include command.html cmd="crest struc.xyz --cregen input-ensemble.xyz --ewin 30" %} {% include note.html content="The `--ewin 30` option was added here to increase the energy window to 30 kcal/mol. This is necessary because the zwitterion has a much higher energy in the gas phase than the neutral Ala-Gly structures and would otherwise be sorted out simply by the energy threshold." %}
{% include defaulttab.html id="tab-id-2" %}
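The need for the larger energy window can be checked with a short unit conversion. The sketch below uses the lowest energy from the ensemble (-33.86165671 Eh) and the zwitterion energy quoted in the note (-33.818790361116 Eh). The helper names are hypothetical, and 627.5095 kcal/mol per Hartree is the usual conversion factor.

```python
HARTREE_TO_KCAL = 627.5095  # kcal/mol per Hartree

E_LOWEST = -33.86165671          # lowest structure in input-ensemble.xyz
E_ZWITTERION = -33.818790361116  # gas-phase single point of the zwitterion


def relative_energy_kcal(energy, reference):
    """Energy relative to the reference, converted from Hartree to kcal/mol."""
    return (energy - reference) * HARTREE_TO_KCAL


def within_window(energy, reference, window_kcal):
    """True if the structure lies inside the given energy window."""
    return relative_energy_kcal(energy, reference) <= window_kcal


delta_kcal = relative_energy_kcal(E_ZWITTERION, E_LOWEST)
print(f"zwitterion relative energy: {delta_kcal:.1f} kcal/mol")
print("inside default 6 kcal/mol window:", within_window(E_ZWITTERION, E_LOWEST, 6.0))
print("inside 30 kcal/mol window       :", within_window(E_ZWITTERION, E_LOWEST, 30.0))
```

The zwitterion lies roughly 27 kcal/mol above the lowest neutral structure, so it falls outside the default window but inside the one set by `--ewin 30`.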

You will notice that the zwitterion is sorted out immediately, regardless of the energy threshold in CREGEN. This is due to the mismatch between the topology (connectivity) of the reference struc.xyz and that of the newly added zwitterion.xyz. If you "reverse" the reference, i.e., by using {: .text-justify }

{% include command.html cmd="crest zwitterion.xyz --cregen input-ensemble.xyz" %}

only the zwitterion should remain, because it is now used to construct the reference topology. {: .text-justify }
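To illustrate what a connectivity comparison of this kind can look like, here is a minimal sketch that derives a bond list from interatomic distances and compares it between a reference and a candidate structure. It is an illustration only and not CREST's actual implementation. The distance criterion (sum of covalent radii times an assumed scale factor of 1.3), the function names, and the three-atom toy geometry mimicking an O-H to N-H proton transfer are all assumptions.

```python
import math
from itertools import combinations

# Approximate covalent radii in Angstrom for the elements in Ala-Gly.
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def bond_set(atoms, scale=1.3):
    """Return the set of bonded index pairs (i, j) with i < j.

    atoms is a list of (symbol, (x, y, z)) tuples. Two atoms count as bonded
    if their distance is below scale * (r_i + r_j). The scale is an assumption.
    """
    bonds = set()
    for (i, (sym_i, pos_i)), (j, (sym_j, pos_j)) in combinations(enumerate(atoms), 2):
        cutoff = scale * (COVALENT_RADII[sym_i] + COVALENT_RADII[sym_j])
        if math.dist(pos_i, pos_j) < cutoff:
            bonds.add((i, j))
    return bonds


def same_topology(reference, candidate):
    """True if both structures have identical bond sets (same atom order)."""
    return bond_set(reference) == bond_set(candidate)


# Toy model: the proton sits on O in the "neutral" form and on N in the
# "zwitterionic" form. Atom order is the same in both structures.
neutral = [("N", (0.0, 0.0, 0.0)), ("O", (2.6, 0.0, 0.0)), ("H", (1.64, 0.0, 0.0))]
zwitterion_like = [("N", (0.0, 0.0, 0.0)), ("O", (2.6, 0.0, 0.0)), ("H", (1.0, 0.0, 0.0))]

print(bond_set(neutral))
print(bond_set(zwitterion_like))
print(same_topology(neutral, zwitterion_like))
```

In this picture, the structure that serves as the reference defines the allowed bond set, so swapping the reference swaps which species survives the check.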

There is, however, a way to deal with topology mismatches if required: {: .text-justify }

All topology checks in CREGEN can be skipped by adding the --notopo option to the program call. {: .text-justify }

{% include command.html cmd="crest zwitterion.xyz --cregen input-ensemble.xyz --ewin 30 --notopo" %}

With this command, any isomer of the reference structure in the ensemble file will be considered, as long as it satisfies the format requirements from the Input Formats section.