Rework string length rules #84
Comments
Yeah, I agree that the GitHub code search shows that I looked into it a bit more, and I believe To summarize, I think we should:
For most users, this should cause little if any churn, and those who explicitly allow Unicode in their strings will now have the option of a much more reliable metric for "character count".
Yes, that sounds great! At the moment, I'm working on updating the
Hello! Thank you for your crate. I highly appreciate it. However, I've noticed one aspect throughout the entire crate that doesn't resonate with me - specifically, the implementation of the string `length` rule. In my view, this rule could be misleading because: (Referenced from here)

One reason I admire Rust is its handling of strings: by default, it provides the size in bytes, and additionally, you can get the size in Unicode Scalar Values using the `.chars().count()` method, which doesn't always(!) match your idea of what a "character" is (as mentioned above). It seemed unfamiliar to me that the `length` rule doesn't return the size in bytes.

Therefore, I would like to suggest a new approach for strings:

- Move the current `length` (`.chars().count()`) implementation to the `char_count` rule.
- Add a `grapheme_count` rule that will be available behind its own feature flag, because it will require a crate for working with graphemes (e.g. unicode-segmentation).
- Optionally, move the `byte_length` implementation to the `length` rule.

Downsides of the optional solution: it will significantly break backward compatibility.
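To illustrate why the two metrics diverge, here is a minimal sketch using only the standard library. The helper names `byte_length` and `char_count` are hypothetical and simply mirror the rule names proposed above; they are not the crate's actual implementation.

```rust
/// Hypothetical helper: length of the string's UTF-8 encoding in bytes.
fn byte_length(s: &str) -> usize {
    s.len()
}

/// Hypothetical helper: number of Unicode scalar values in the string.
fn char_count(s: &str) -> usize {
    s.chars().count()
}

fn main() {
    // "e\u{301}" is 'e' followed by U+0301 COMBINING ACUTE ACCENT,
    // which most renderers display as a single "é".
    let samples = ["hello", "e\u{301}", "日本語"];
    for s in samples {
        println!(
            "{:?}: bytes = {}, chars = {}",
            s,
            byte_length(s),
            char_count(s)
        );
    }
}
```

For `"e\u{301}"` this reports 3 bytes and 2 chars, even though a reader sees one character. Counting that as 1 requires grapheme segmentation, which the standard library does not provide, hence the suggestion to put `grapheme_count` behind a feature flag with an external crate.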