Call clean outside of getSentences (e.g. in getTokens) so that SalienceAnalyzer can analyze dirty sentences (and thus reveal possible overfitting to garbage).
NN: call clean and/or lowercase only in InputRepresentation
Replace the call to getTokens in TokenIterator with a call to tokenize
We do cleaning in several places in DataUtilities:
- getSentences() (called from SentenceIterator and getTokens)
- processTextReduced() (called from CharacterTrigram, deprecated)
- cleanText() (called from Patient.getCleanedText(), used only by RBC)
- LSTMClassifier.initializeTruncateLength (removed in d524666)

DRY
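A rough sketch of what the deduplicated layout could look like, assuming Java and a single shared clean routine. The class name CleaningSketch, the method bodies, the sentence-splitting regex, and the cleaning rules are all hypothetical and only illustrate the call structure; they are not the project's actual DataUtilities code. The idea is that getSentences returns raw sentences (so SalienceAnalyzer could still see dirty text), while getTokens applies clean before calling tokenize.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for DataUtilities; names and rules are assumptions.
class CleaningSketch {

    // Single cleaning routine (assumed rules): replace characters other than
    // letters, digits, basic punctuation, and whitespace with a space, then
    // collapse runs of whitespace.
    static String clean(String text) {
        return text.replaceAll("[^\\p{L}\\p{N}.,;:!?'\\s-]", " ")
                   .replaceAll("\\s+", " ")
                   .trim();
    }

    // Splits text into sentences WITHOUT cleaning, so callers can still
    // inspect the dirty input.
    static List<String> getSentences(String text) {
        List<String> sentences = new ArrayList<>();
        for (String s : text.split("(?<=[.!?])\\s+")) {
            if (!s.trim().isEmpty()) {
                sentences.add(s);
            }
        }
        return sentences;
    }

    // Plain whitespace tokenizer; does no cleaning of its own.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // The only place where cleaning is applied before tokenizing.
    static List<String> getTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String sentence : getSentences(text)) {
            tokens.addAll(tokenize(clean(sentence)));
        }
        return tokens;
    }
}
```

With this split, a TokenIterator-like caller that wants raw tokens would call tokenize directly, and anything that needs cleaned tokens goes through getTokens.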