- DTASimplifier
- TopFSimplifier
- DTAChopper
- TopFChopper
- SATZKLAMMERtoTopF
- TSVIndexer
- CoNLLUPLUSIndexer
- TUEBADSTopFExtractor
- HIPKONtoSTTSMapper
- addmissingSTTStoHIPKON
- HiTStoSTTSMapper
- ANSELMtoSTTSMapper
- ReFHiTStoSTTSMapper
- MercuriusToSTTSMapper
- ReFUPToSTTSMapper
- FuerstinnentoSTTSMapper
- VirgelMapper
- PronominalAdverbMapper
- ReFUPCoding
- BracketRemover
- DependencyProcessor
- DependencyManipulator
- TreeToBIOProcessor
- Removes unneeded annotation columns and simplifies some annotations from the DTA corpus.
- Doc-Object that contains tokens with annotations from the DTA Corpus.
- Doc-Object that contains tokens with simplified columns.
- The remaining columns are:
ID, FORM, XPOS, LEMMA, OrthCorr, Cite, Antec, AntecHead, SentBrckt, MovElem, MovElemPos, RelCType, AdvCVPos, AdvCVHead
- Removes unneeded annotation columns, simplifies the topological field annotation and creates a sentence bracket column.
- Doc-Object that contains tokens with TopF annotation.
- Doc-Object that contains tokens with simplified columns.
- The remaining columns are:
ID, FORM, XPOS, LEMMA, FEATS, DEPREL, HEAD, CHUNK, TopF, SentBrckt
- Removes unannotated sentences and reindexes the sentences and character offsets.
- Doc-Object that contains tokens that have at least a MovElemCat, TSVID, CHARS and FORM attribute.
- Doc-Object that contains only sentences with relevant MovElemCat annotation.
- Removes unannotated sentences.
- Doc-Object that contains tokens that have at least a TopF attribute.
- Doc-Object that contains only sentences with relevant TopF annotation.
- Maps the attribute 'SATZKLAMMER' from HIPKON to the corresponding topological field.
- Doc-Object that contains tokens that have at least a SATZKLAMMER attribute.
- Doc-Object that contains tokens with added TopF attribute.
- Adds the sentence and word index and the character offsets for the WebAnno TSV Format.
- Doc-Object that contains tokens that have at least a FORM attribute.
- Doc-Object that contains tokens with added TSVID and CHARS attribute.
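The indexing step can be sketched as follows. This is a minimal illustration, not the actual TSVIndexer implementation: the function name `index_for_tsv`, the representation of tokens as plain dictionaries, and the assumption that tokens are separated by exactly one character in the underlying text are all hypothetical.

```python
def index_for_tsv(sentences):
    """Add TSVID and CHARS to every token.

    `sentences` is a list of sentences, each a list of token dicts
    with at least a FORM key (assumed data layout).
    """
    offset = 0
    for sent_no, sentence in enumerate(sentences, start=1):
        for tok_no, token in enumerate(sentence, start=1):
            start = offset
            end = start + len(token["FORM"])
            # Sentence-word index, e.g. "1-2" for the second token of sentence 1.
            token["TSVID"] = f"{sent_no}-{tok_no}"
            # Character span of the token, start inclusive, end exclusive.
            token["CHARS"] = f"{start}-{end}"
            # Assumption: one separator character between adjacent tokens.
            offset = end + 1
    return sentences
```

Because the offsets run continuously across sentences, removing a sentence afterwards requires reindexing, which is what the chopper step above does for the DTA data.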
- Adds the sentence and word index for the CoNLL-U Plus Format.
- Doc-Object that contains sentences with tokens.
- Doc-Object that contains sentences with added sent_id attribute and tokens with added ID attribute.
- Extracts the topological field information from TueBa-D/S.
- Doc-Object that contains tokens with at least a POS:HD and SYNTAX attribute.
- Doc-Object that contains tokens with added PHRASE:HEAD and TopoField attribute.
- Maps the POS-Tags from HIPKON to their corresponding STTS-Tags.
- Text-file of the form `POS-Tag\tSTTS-Tag` that contains the rules for the mapping.
- Doc-Object that contains tokens that have at least a FORM and POS attribute.
- Doc-Object that contains tokens with added XPOS attribute for the STTS-Tags.
- CSV-File with tokens that still don't have an STTS-Tag and their exact location.
- List of the rules that were actually used during the mapping.
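A rule-based mapping of this kind can be sketched roughly as below. The helper names `load_rules` and `apply_rules`, the dictionary-based token representation, and the example tags are assumptions made for illustration; the real mapper may use additional information such as the word form.

```python
def load_rules(lines):
    """Parse tab-separated 'POS-Tag<TAB>STTS-Tag' lines into a dict."""
    rules = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip empty lines
        fields = line.split("\t")
        rules[fields[0]] = fields[1]
    return rules


def apply_rules(tokens, rules):
    """Set XPOS where a rule matches the token's POS tag.

    Returns the tokens that stay without an STTS tag and the sorted
    list of source tags whose rules were actually applied.
    """
    unmapped = []
    used = set()
    for token in tokens:
        pos = token["POS"]
        if pos in rules:
            token["XPOS"] = rules[pos]
            used.add(pos)
        else:
            unmapped.append(token)
    return unmapped, sorted(used)
```

In this sketch the unmapped tokens correspond to the CSV output of remaining tokens, and the list of used tags corresponds to the list of applied rules.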
- Maps the POS-Tags of the remaining tokens after processing with the HIPKONtoSTTSMapper to their corresponding STTS-Tags.
- CSV-File of the form `Token\tFilename\tSent_id\tTok_id\tSTTS-Tag` that contains the STTS-Tags of the remaining tokens.
- Doc-Object that contains the tokens.
- Doc-Object where all tokens have an XPOS attribute with their STTS-Tag.
- Maps the HiTS-Tags from ReM to their corresponding STTS-Tags.
- CSV-File of the form `POS-Tag\tPOSLEMMA-Tag\tCount\tCandidates\tSTTS-Tag\tRemarks` that contains the rules for the mapping.
- Doc-Object that contains tokens that have at least ID, FORM, POS, POS_GEN and PUNC attributes.
- Doc-Object that contains tokens with added XPOS attribute for the STTS-Tags.
- Maps the POS-Tags from Anselm to their corresponding STTS-Tags.
- CSV-File of the form `POS-Tag\tSTTS-Tag` that contains the rules for the mapping.
- Doc-Object that contains tokens that have at least a POS attribute.
- Doc-Object that contains tokens with added XPOS attribute for the STTS-Tags.
- Maps the HiTS-Tags from ReF.BO to their corresponding STTS-Tags.
- CSV-File of the form `POS-Tag\tPOSLEMMA-Tag\tSTTS-Tag` that contains the rules for the mapping.
- Doc-Object that contains tokens that have at least ID, FORM, POS and POS_LEMMA attributes.
- Doc-Object that contains tokens with added XPOS attribute for the STTS-Tags.
- Maps the POS-Tags from Mercurius to their corresponding STTS-Tags.
- CSV-file of the form `POS-Tag\tSTTS-Tag\tComments` that contains the rules for the mapping.
- Doc-Object that contains tokens that have at least ID, FORM and POS attributes.
- Doc-Object that contains tokens with added XPOS attribute for the STTS-Tags.
- Maps the POS-Tags from ReF.UP to their corresponding STTS-Tags.
- CSV-file of the form `POS-Tag\tSTTS-Tag` that contains the rules for the mapping.
- Doc-Object that contains tokens that have at least ID, FORM and POS attributes.
- Doc-Object that contains tokens with added XPOS attribute for the STTS-Tags.
- Maps the POS-Tags from Fuerstinnenkorrespondenz to their corresponding STTS-Tags.
- CSV-file of the form `POS-Tag\tSTTS-Tag` that contains the rules for the mapping.
- Doc-Object that contains tokens that have at least a POS and LEMMA attribute.
- Doc-Object that contains tokens with added XPOS attribute for the STTS-Tags.
- Maps every Virgel (a token of the form "/") in a document to the XPOS-Tag "$(".
- Doc-Object that contains tokens that have at least a FORM attribute.
- Doc-Object that contains tokens with updated XPOS-Tag for Virgel.
- Maps all pronominal adverbs with the XPOS-Tag "PROAV" to the XPOS-Tag "PAV".
- Doc-Object that contains tokens that have at least an XPOS attribute.
- Doc-Object that contains tokens with updated XPOS-Tag for pronominal adverbs.
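The two tag corrections above amount to simple per-token rewrites. A minimal sketch, assuming tokens are plain dictionaries and using the hypothetical function names `map_virgel` and `map_pronominal_adverbs`:

```python
def map_virgel(tokens):
    """Tag every token whose form is exactly "/" as "$("."""
    for token in tokens:
        if token["FORM"] == "/":
            token["XPOS"] = "$("
    return tokens


def map_pronominal_adverbs(tokens):
    """Rename the XPOS tag "PROAV" to "PAV"; other tags stay untouched."""
    for token in tokens:
        if token.get("XPOS") == "PROAV":
            token["XPOS"] = "PAV"
    return tokens
```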
- Corrects the encoding of "ß" in ReF.UP.
- Doc-Object that contains tokens that have at least a FORM attribute.
- Doc-Object that contains tokens with updated encoding.
- Removes all kinds of brackets from the tokens (except in punctuation tokens).
- Doc-Object that contains tokens that have at least a FORM attribute.
- Doc-Object that contains tokens without brackets in their word form.
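One way to picture the bracket removal is the sketch below. It rests on two assumptions that the description does not confirm: which characters count as brackets, and that a punctuation token is a token without any alphanumeric character. The function name `remove_brackets` is hypothetical.

```python
import re

# Assumed set of bracket characters; the actual processor may differ.
BRACKETS = re.compile(r"[()\[\]{}<>]")


def remove_brackets(tokens):
    """Strip bracket characters from word forms, leaving punctuation tokens alone."""
    for token in tokens:
        form = token["FORM"]
        # Assumption: a token without letters or digits is a punctuation token.
        if not any(ch.isalnum() for ch in form):
            continue
        token["FORM"] = BRACKETS.sub("", form)
    return tokens
```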
- Extracts the dependency relations of all tokens.
- Doc-Object that contains tokens with dependency annotations (at least ID and HEAD attributes).
- Doc-Object with added head_tok attribute (for the parent) and dep_toks attribute (for the children) for each token and added roots attribute (list of all root tokens) for each sentence.
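The linking of heads and dependents can be sketched as follows. The attribute names `head_tok`, `dep_toks` and `roots` come from the description above; the function name `link_dependencies`, the dictionary-based data layout, and the rule that any token whose HEAD matches no ID in the sentence (for example `0`) counts as a root are assumptions.

```python
def link_dependencies(sentence):
    """Attach head_tok and dep_toks to each token and roots to the sentence.

    `sentence` is a dict with a "tokens" list; each token is a dict
    with ID and HEAD values of the same type (assumed layout).
    """
    tokens = sentence["tokens"]
    by_id = {token["ID"]: token for token in tokens}
    for token in tokens:
        token["head_tok"] = None
        token["dep_toks"] = []
    roots = []
    for token in tokens:
        head = by_id.get(token["HEAD"])
        if head is None:
            # HEAD points outside the sentence (e.g. 0): treat as root.
            roots.append(token)
        else:
            token["head_tok"] = head
            head["dep_toks"].append(token)
    sentence["roots"] = roots
    return sentence
```

Keeping `roots` as a list rather than a single token allows for sentences with more than one root, as the description above implies.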
- Maps dependency annotations to a different dependency annotation scheme.
- Doc-Object that contains tokens with dependency annotations (at least ID, HEAD and DEPREL attributes).
- Doc-Object with changed dependency annotations of the tokens.
- Converts a tree object to stacked BIO annotations.
- Doc-Object that contains sentences with a tree object as attribute `tree`.
- Doc-Object with BIO annotations as `TREE` attribute of the tokens.
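To illustrate what stacked BIO annotations can look like, here is a small sketch. It does not reproduce the actual tree object or output format of the TreeToBIOProcessor: the tree is modelled as nested `(label, children)` tuples, the levels are joined with `|`, and the function name `tree_to_bio` is hypothetical.

```python
def tree_to_bio(tree):
    """Convert a nested (label, children) tree to stacked BIO tags.

    Children are either token strings or nested (label, children) tuples.
    Returns a list of (token, tag) pairs, with one B-/I- tag per
    tree level, ordered from the root down to the innermost node.
    """
    rows = []

    def walk(node, path):
        label, children = node
        state = {"label": label, "started": False}
        for child in children:
            if isinstance(child, str):
                tags = []
                for level in path + [state]:
                    # The first token inside a constituent gets B-, later ones I-.
                    prefix = "I-" if level["started"] else "B-"
                    level["started"] = True
                    tags.append(prefix + level["label"])
                rows.append((child, "|".join(tags)))
            else:
                walk(child, path + [state])

    walk(tree, [])
    return rows
```

For the tree `("S", [("NP", ["Das", "Haus"]), ("VP", ["steht"])])` this sketch yields `B-S|B-NP` for "Das", `I-S|I-NP` for "Haus" and `I-S|B-VP` for "steht".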