Need to train models for each major entity class: `PRGE`, `LIVB`, `DISO`, `CHED`. The first three are fairly straightforward. As for the last, the entity annotations come at multiple levels of granularity; for now, we might just cheat and collapse everything under the `CHED` tag.
For relations, we are at the mercy of what datasets are available. Right now, we could train a model for adverse drug events using the ADE corpus.
There should be a base and a large version of each model. In the case of BERT, this corresponds to whether the BERT base or large model was used. Any model that is not implemented should raise a `NotImplementedError` (see #155).
Finally, the model names should follow a convention, maybe `[model-name]-[entity or relation]-[base or large]`, e.g. `bert-for-ner-prge`, `bert-for-ner-prge-lg`. See PyTorch Transformers or spaCy for inspiration.
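A minimal sketch of how the naming convention and the `NotImplementedError` behaviour could fit together. The `PRETRAINED_MODELS` registry, the `load` function, and the weight paths are hypothetical names for illustration, not Saber's actual API:

```python
# Hypothetical registry mapping model names (following the proposed
# [model-name]-[entity or relation]-[base or large] convention) to weights.
# Only PRGE is "trained" in this sketch; anything else should raise.
PRETRAINED_MODELS = {
    "bert-for-ner-prge": "path/to/prge-base-weights",
    "bert-for-ner-prge-lg": "path/to/prge-large-weights",
}

def load(model_name: str) -> str:
    """Return the weights for `model_name`, or raise NotImplementedError."""
    try:
        return PRETRAINED_MODELS[model_name]
    except KeyError:
        raise NotImplementedError(
            f"No pretrained model named {model_name!r} (see #155)."
        ) from None

load("bert-for-ner-prge")    # OK
# load("bert-for-ner-livb")  # would raise NotImplementedError
```

Raising `NotImplementedError` (rather than `KeyError`) makes the contract explicit: the name is recognized by the convention but no model has been trained for it yet.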
### BERT

**Entities**

- [ ] Train PRGE-base
- [ ] Train PRGE-large
- [ ] Train LIVB-base
- [ ] Train LIVB-large
- [ ] Train DISO-base
- [ ] Train DISO-large
- [ ] Train CHED-base
- [ ] Train CHED-large

**Relations**

- [ ] Train ADE
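The entity checklist above is just the cross-product of the four entity classes and the two model sizes. Under the proposed convention, the full set of names could be generated as follows (the bare name for base and a `-lg` suffix for large are assumptions inferred from the `bert-for-ner-prge` / `bert-for-ner-prge-lg` examples):

```python
from itertools import product

ENTITY_CLASSES = ["prge", "livb", "diso", "ched"]
SIZES = ["base", "lg"]  # assumed: no suffix for base, "-lg" for large

def model_name(entity: str, size: str) -> str:
    """Build a name like 'bert-for-ner-prge' or 'bert-for-ner-prge-lg'."""
    name = f"bert-for-ner-{entity}"
    return name if size == "base" else f"{name}-{size}"

# All 8 entity models from the checklist above.
names = [model_name(e, s) for e, s in product(ENTITY_CLASSES, SIZES)]
# → ['bert-for-ner-prge', 'bert-for-ner-prge-lg', 'bert-for-ner-livb', ...]
```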
I am currently working on a review of taxon-mention recognition tools for ecological information extraction, and I have just discovered Saber, which I'd like to include as an example of a state-of-the-art deep-learning-based approach.
Unfortunately, it seems that the LIVB pre-trained model does not exist at the moment. Any idea when it might be available? Or should I consider training my own model?
Thanks for your interest. Unfortunately, we are no longer maintaining the project. I would suggest checking out AllenNLP, Transformers, or scispaCy for state-of-the-art NER. scispaCy has pretrained models that will detect organism names (see the model trained on BIONLP13CG specifically).