Normalize German variants #87
Labels: P2 (high-priority issues, a COULD)
Comments
Maybe normalize all to k also during training to make the models denser?
If normalizing before training, the same cleaning routines should be applied to the gold standard (GS) at runtime to avoid false negatives.
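A minimal sketch of that runtime comparison (the `normalize` and `matches_gold` names are illustrative, not part of the acres API, and the mapping direction z -> c -> k is an assumption based on the "normalize all to k" suggestion above):

```python
def normalize(token: str) -> str:
    """Collapse spelling variants onto one canonical form (assumed z -> c -> k)."""
    t = token.lower()
    t = t.replace("z", "c")
    return t.replace("c", "k")

def matches_gold(candidate: str, gold: str) -> bool:
    # Apply the SAME cleaning to both sides, so a model output produced
    # from normalized training data is not scored as a false negative
    # against an unnormalized gold-standard entry.
    return normalize(candidate) == normalize(gold)
```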
This may actually require annotating the new expansions, since some of them could be considered typos, e.g. "becannt"/"druccausgleich", "karotis"/"kava".
michelole added a commit to michelole/acres that referenced this issue on Aug 21, 2020:
Spelling variants are better handled with a normalization step than with an exponential increase of expansion candidates, which led to very slow processing and several bugs. This refs bst-mug#87 and closes bst-mug#98. Also removed `get_acro_def_pair_score`, which was originally intended for web-based inputs (i.e., text with acronym-definition pairs).
Normalize spelling variants, e.g. c <-> k and z <-> c, before applying the filtering rules.
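One way to sketch such a normalization step (the mapping direction, collapsing everything onto k via z -> c -> k, is an assumption drawn from the comments; real rules may need more context than character-level replacement):

```python
def normalize(token: str) -> str:
    # Collapse German spelling variants onto a single canonical form
    # before the filtering rules run. Chaining z -> c and then c -> k
    # makes pairs like "cava"/"kava" and "becannt"/"bekannt" compare equal.
    t = token.lower()
    t = t.replace("z", "c")
    return t.replace("c", "k")
```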