feat(string): add find_by_charcode #2084

Yu-zh · 2025-05-12T07:00:07Z

Motivation: find_by_charcode should be more efficient than find_by because it skips the surrogate-pair checks. This API is intended for performance-critical applications.

Additionally, I'm not sure whether we want to expose an API that returns Iter[Int] or similar, so that users who care about performance can implement their own high-performance string operations.
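To make the motivation concrete, here is a minimal sketch of what a code-unit scan looks like. It is not the implementation from this PR: `first_code_unit_index` is a hypothetical helper, and it assumes that `String::length` counts UTF-16 code units and that `String::charcode_at` returns the code unit at a given index as an `Int`. The loop never checks whether a unit is half of a surrogate pair, which is where the savings over find_by would come from.

```moonbit
// Hypothetical helper for illustration only (not the code added in this PR).
// Scans UTF-16 code units left to right and returns the index of the first
// one accepted by `pred`, or None if no code unit matches.
fn first_code_unit_index(s : String, pred : (Int) -> Bool) -> Int? {
  for i in 0..<s.length() {
    // Assumption: charcode_at yields the raw code unit, with no decoding
    // of surrogate pairs into a single code point.
    if pred(s.charcode_at(i)) {
      return Some(i)
    }
  }
  None
}
```

For a string such as `"a😀b"`, the emoji occupies two code units (0xD83D, 0xDE00), so a predicate matching `'b'` (98) is satisfied at code-unit index 3, and the predicate also sees each surrogate half on its own.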

peter-jerry-ye-code-review · 2025-05-12T07:00:28Z

Direct duplication of find_by_charcode implementation between String and View

Category
Performance
Code Snippet
pub fn String::find_by_charcode and pub fn View::find_by_charcode have identical implementations
Recommendation
Consider implementing String::find_by_charcode in terms of View::find_by_charcode like:

pub fn String::find_by_charcode(self : String, pred : (Int) -> Bool) -> Int? {
  self[:].find_by_charcode(pred)
}

Reasoning
Duplicated code increases maintenance burden. Since String can be viewed as a View, we can reuse the View implementation to maintain consistency and reduce code duplication, similar to how the original find_by was implemented.

Missing documentation about performance implications and usage guidance

Category
Maintainability
Code Snippet
/// Find the index of the first charcode that satisfies the given predicate.
Recommendation
Expand documentation to explain performance benefits and limitations:

/// Find the index of the first charcode that satisfies the given predicate.
/// This is a low-level API that operates on raw character codes for performance.
/// Note: Use with caution as it bypasses Unicode character boundaries.
/// For most cases, prefer find_by unless performance is critical.

Reasoning
The current documentation doesn't explain why this API exists alongside find_by or when to use one over the other. Adding clear guidance helps developers make informed decisions.

Test cases could be more comprehensive for edge cases

Category
Correctness
Code Snippet
test "find_by_charcode" {
Recommendation
Add test cases for:

- Invalid UTF-8 sequences
- Strings with mixed surrogate pairs
- Performance comparison with find_by
- Boundary conditions with very large strings

Reasoning
While the current test suite is good, adding these cases would help verify the correctness and performance benefits in various scenarios that might occur in production use.
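A few of the edge cases above could be sketched as follows. This assumes the `String::find_by_charcode(self : String, pred : (Int) -> Bool) -> Int?` signature quoted earlier in this review, that the predicate receives raw UTF-16 code units, and the `assert_eq` form without a trailing `!` (older toolchains may need `assert_eq!`). The test names are made up for illustration, and only results that hold under those assumptions are asserted.

```moonbit
test "find_by_charcode: ascii" {
  // 'l' is code unit 108; its first occurrence in "hello" is at index 2.
  assert_eq("hello".find_by_charcode(fn(c) { c == 108 }), Some(2))
}

test "find_by_charcode: empty string and no match" {
  // Nothing to scan, so the predicate is never satisfied.
  assert_eq("".find_by_charcode(fn(_c) { true }), None)
  // No code unit in "hello" equals 0.
  assert_eq("hello".find_by_charcode(fn(c) { c == 0 }), None)
}

test "find_by_charcode: predicate sees surrogate halves" {
  // U+1F600 is encoded as the pair 0xD83D 0xDE00, so the high surrogate
  // is the very first code unit of the string.
  assert_eq("😀a".find_by_charcode(fn(c) { c == 0xD83D }), Some(0))
}
```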

coveralls · 2025-05-12T07:02:09Z

Pull Request Test Coverage Report for Build 6727

Details

6 of 6 (100.0%) changed or added relevant lines in 1 file are covered.
1 unchanged line in 1 file lost coverage.
Overall coverage increased (+0.02%) to 92.463%

Files with Coverage Reduction	New Missed Lines	%
string/view.mbt	1	84.51%

Totals
Change from base Build 6725:	0.02%
Covered Lines:	6146
Relevant Lines:	6647

💛 - Coveralls

bobzhang force-pushed the string-api branch from dc99668 to 1f1d52e Compare May 13, 2025 01:29

Yu-zh added 2 commits May 16, 2025 10:21

feat(string): add find_by_charcode (aeac773)
fmt (f6d68a1)

Yu-zh force-pushed the string-api branch from 1f1d52e to f6d68a1 Compare May 16, 2025 03:04

Yu-zh requested a review from bobzhang May 16, 2025 06:15
