Here, confidence is the model's (self-reported) confidence in its prediction, calculated as
$$
\text{confidence}_c^{m_i} = 2|p_c^{m_i} - 0.5|
$$
For example, if a model makes a positive prediction with $p_c^{m_i} = 0.55$, the confidence is $2|0.55 - 0.5| = 0.1$.
The model is not very confident in this prediction and is close to switching to a negative one.
If another model is very sure about its negative prediction with $p_c^{m_j} = 0.1$, its confidence is $2|0.1 - 0.5| = 0.8$.
When the two models disagree, the negative prediction therefore carries more weight.

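The confidence formula can be checked with a few lines of Python. This is a minimal sketch: the function name `confidence` is chosen here for illustration and is not taken from the chebifier code base.

```python
def confidence(p: float) -> float:
    """Return 2 * |p - 0.5| for a predicted probability p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"expected a probability in [0, 1], got {p}")
    return 2.0 * abs(p - 0.5)


# The two examples from the text (rounded to hide floating-point noise).
print(round(confidence(0.55), 3))  # weakly positive model
print(round(confidence(0.1), 3))   # strongly negative model
```

A probability of exactly 0.5 yields a confidence of 0, and probabilities of 0 or 1 yield a confidence of 1.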
Confidence weighting can be disabled via the `use_confidence` parameter of the predict method (default: True).

The `model_weight` can be set for each model in the configuration file (default: 1). It is used to favor a certain
model independently of a given class.
Trust is based on the model's performance on a validation set. After training, we evaluate the machine learning models
on a validation set for each class. If the `ensemble_type` is set to `wmv-f1`, the trust is calculated as 1 + the F1 score.
If the `ensemble_type` is set to `mv` (the default), the trust is set to 1 for all models.

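The interplay of confidence, `model_weight` and trust can be sketched as follows. This excerpt does not show how the three factors are combined, so the sketch assumes that each vote is weighted by their product and that disabling confidence replaces it with 1. The function names, the tuple layout and the tie-breaking rule are likewise assumptions made for illustration.

```python
def trust(ensemble_type: str, f1: float) -> float:
    """Trust as described in the text: 1 + F1 for 'wmv-f1', 1 for 'mv'."""
    if ensemble_type == "wmv-f1":
        return 1.0 + f1
    if ensemble_type == "mv":
        return 1.0
    raise ValueError(f"unknown ensemble type: {ensemble_type}")


def weighted_vote(models, use_confidence: bool = True) -> int:
    """Decide one class from (p, model_weight, trust) tuples.

    Assumption: vote weight = confidence * model_weight * trust,
    and ties are resolved as a negative prediction.
    """
    positive, negative = 0.0, 0.0
    for p, model_weight, trust_value in models:
        conf = 2.0 * abs(p - 0.5) if use_confidence else 1.0
        weight = conf * model_weight * trust_value
        if p > 0.5:
            positive += weight
        else:
            negative += weight
    return 1 if positive > negative else 0


# Two models from the confidence example, both with default weight and 'mv' trust.
models = [(0.55, 1.0, trust("mv", 0.0)), (0.1, 1.0, trust("mv", 0.0))]
print(weighted_vote(models))
```

Under these assumptions, the weakly positive model (weight 0.1) is outvoted by the strongly negative one (weight 0.8).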
3. After a decision has been made for each class independently, the consistency of the predictions with regard to the ChEBI hierarchy
   and disjointness axioms is checked. This is
   done in 3 steps:
   - (1) First, the hierarchy is corrected. For each pair of classes $A$ and $B$ where $A$ is a subclass of $B$ (following
     the is-a relation in ChEBI), we set the ensemble prediction of $B$ to 1 if the prediction of $A$ is 1. Intuitively
     speaking, if we have determined that a molecule belongs to a specific class (e.g., aromatic primary alcohol), it also
     belongs to the direct and indirect superclasses (e.g., primary alcohol, aromatic alcohol, alcohol).
   - (2) Next, we check for disjointness. This is not specified directly in ChEBI, but in an additional ChEBI module ([chebi-disjoints.owl](https://ftp.ebi.ac.uk/pub/databases/chebi/ontology/)).
     We have extracted these disjointness axioms into a CSV file and added some more disjointness axioms ourselves (see
     `data>disjoint_chebi.csv` and `data>disjoint_additional.csv`). If two classes $A$ and $B$ are disjoint and we predict
     both, we select one of them randomly and set the other to 0.
   - (3) Since the second step might have introduced new inconsistencies into the hierarchy, we repeat the first step, but
     with a small change. For a pair of classes $A \subseteq B$ with predictions $1$ and $0$, instead of setting $B$ to $1$,
     we now set $A$ to $0$. This has the advantage that we cannot introduce new disjointness-inconsistencies and don't have
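The three consistency steps described above can be sketched in Python as follows. This is an illustrative reading of the text, not the chebifier implementation: the function name, the dictionary and pair-list data structures, and the class names in the usage example are all assumptions.

```python
import random


def enforce_consistency(pred, subclass_pairs, disjoint_pairs, rng=None):
    """Return a copy of pred (class -> 0/1) made consistent in three steps.

    subclass_pairs: (A, B) tuples meaning "A is a subclass of B".
    disjoint_pairs: (A, B) tuples meaning "A and B are disjoint".
    """
    rng = rng or random.Random()
    pred = dict(pred)

    # Step 1: propagate positive predictions up to all superclasses.
    changed = True
    while changed:
        changed = False
        for a, b in subclass_pairs:
            if pred.get(a, 0) == 1 and pred.get(b, 0) != 1:
                pred[b] = 1
                changed = True

    # Step 2: for each violated disjointness axiom, drop one class at random.
    for a, b in disjoint_pairs:
        if pred.get(a, 0) == 1 and pred.get(b, 0) == 1:
            pred[rng.choice([a, b])] = 0

    # Step 3: propagate negative predictions down to all subclasses.
    changed = True
    while changed:
        changed = False
        for a, b in subclass_pairs:
            if pred.get(a, 0) == 1 and pred.get(b, 0) == 0:
                pred[a] = 0
                changed = True
    return pred


hierarchy = [("aromatic primary alcohol", "primary alcohol"),
             ("primary alcohol", "alcohol")]
result = enforce_consistency({"aromatic primary alcohol": 1}, hierarchy, [])
print(result)
```

The loops repeat until nothing changes, so indirect superclasses and subclasses are reached even when only direct is-a pairs are supplied.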
chebifier/cli.py (1 addition, 0 deletions)
@@ -26,6 +26,7 @@ def cli():
 @click.option('--output', '-o', type=click.Path(), help='Output file to save predictions (optional)')
 @click.option('--ensemble-type', '-e', type=click.Choice(ENSEMBLES.keys()), default='mv', help='Type of ensemble to use (default: Majority Voting)')
 @click.option("--chebi-version", "-v", type=int, default=241, help="ChEBI version to use for checking consistency (default: 241)")
+@click.option("--use-confidence", "-c", is_flag=True, default=True, help="Weight predictions based on how 'confident' a model is in its prediction (default: True)")
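One possible alternative for the new option, sketched under the assumption that users should be able to turn confidence weighting off from the command line, is click's paired on/off flag syntax. The `predict` command below is a hypothetical stand-in that only echoes the flag; it is not the command defined in `chebifier/cli.py`.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option(
    "--use-confidence/--no-use-confidence",
    default=True,
    help="Weight predictions by model confidence (default: enabled)",
)
def predict(use_confidence: bool) -> None:
    # Hypothetical stand-in: a real command would pass the value on
    # to the ensemble's predict method.
    click.echo(f"use_confidence={use_confidence}")


# Invoke the command in-process, once with defaults and once with the off switch.
runner = CliRunner()
print(runner.invoke(predict, []).output.strip())
print(runner.invoke(predict, ["--no-use-confidence"]).output.strip())
```

With a paired flag, the default stays enabled while `--no-use-confidence` gives an explicit way to disable it.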