dbnski commented Oct 14, 2019

The original code uses the signed char type for variables that hold the character being processed. This breaks the 8-bit extended characters (byte values above 127, outside 7-bit ASCII) used in many European languages, and it also prevents the library from correctly handling unescaped UTF-8 characters. The attached patch addresses both problems and further improves support for UTF-8-encoded strings. I am aware that updating all methods wasn't necessary to make this work, but I thought it was a good idea to do so anyway for consistency.
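To illustrate the kind of failure described above, here is a minimal sketch in C. It is not taken from the patch or the library: the function names and the `c >= 0x20` acceptance rule are hypothetical, chosen only to show how a byte above 127 turns negative when held in a signed char and then fails an ordinary range check.

```c
#include <stddef.h>

/* Hypothetical check: accept any byte that is not a control character.
   With a signed char, bytes 0x80..0xFF are negative on mainstream
   two's-complement platforms, so they are wrongly rejected. */
static int accept_signed(signed char c)
{
    return c >= 0x20;
}

/* Same check with unsigned char: bytes 0x80..0xFF stay in 128..255
   and are accepted. */
static int accept_unsigned(unsigned char c)
{
    return c >= 0x20;
}

/* Count how many bytes of s pass the signed check. */
static size_t count_accepted_signed(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        n += (size_t)accept_signed((signed char)*s);
    return n;
}

/* Count how many bytes of s pass the unsigned check. */
static size_t count_accepted_unsigned(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        n += (size_t)accept_unsigned((unsigned char)*s);
    return n;
}
```

For the UTF-8 string `"caf\xC3\xA9"` ("café", five bytes), the signed variant would accept only the three ASCII bytes on a typical platform, while the unsigned variant accepts all five. Strictly speaking, converting an out-of-range value to signed char is implementation-defined in C before C23, which is one more reason to prefer unsigned char for raw bytes.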

dbnski commented Oct 14, 2019

It was just something quick I put together while trying to use your library in a PoC, but I see it needs some polishing before it can pass the tests. If you are interested in merging this patch, I can update the PR when I have some spare time.
