@evaline-ju
Collaborator

The regex sentence splitter is not a very accurate sentence splitter. Even so, we would like to provide an initial implementation of aggregation and splitting for bidirectional streaming, for the case where streamed text chunks or tokens need to be aggregated into sentences for further sentence-level analysis.

For tracking purposes, the streamed output sentences remain directly concatenable, so they can be joined in order without further processing.
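
As a rough illustration of the idea (not the code in this PR), the aggregation could be sketched as below. The names `SENTENCE_END` and `stream_sentences` and the pattern itself are assumptions made for this example. The sketch buffers incoming chunks, emits a sentence once terminal punctuation followed by whitespace is seen, and keeps the trailing whitespace on each sentence so that the outputs stay directly concatenable.

```python
import re
from typing import Iterable, Iterator

# Hypothetical pattern: a sentence ends at one or more of . ! ?
# followed by whitespace. Deliberately naive, like any regex splitter.
SENTENCE_END = re.compile(r"[.!?]+\s+")


def stream_sentences(chunks: Iterable[str]) -> Iterator[str]:
    """Aggregate streamed text chunks into sentences.

    Trailing whitespace stays attached to each sentence, so joining the
    outputs in order reproduces the concatenated input exactly.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        start = 0
        for match in SENTENCE_END.finditer(buffer):
            if match.end() == len(buffer):
                # The whitespace run may continue in the next chunk; wait.
                break
            yield buffer[start:match.end()]
            start = match.end()
        # Keep only the text that has not been emitted yet.
        buffer = buffer[start:]
    if buffer:
        # Flush whatever remains once the input stream ends.
        yield buffer


chunks = ["Hello wor", "ld. How are", " you? I am", " fine."]
sentences = list(stream_sentences(chunks))
print(sentences)  # ['Hello world. ', 'How are you? ', 'I am fine.']
```

Because nothing is stripped or normalized, `"".join(sentences)` equals `"".join(chunks)`, which is one way to keep character offsets traceable back to the original stream.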

Closes: #345

Development

Successfully merging this pull request may close these issues.

Initial bidirectional streaming tokenization on regex sentence splitter