[BUG] Incorrect String comparison implementation #2575

quake · 2025-07-30T09:11:42Z

Describe the bug
The current String::compare implementation incorrectly compares strings by length first, then by content, which violates the standard lexicographical ordering used by most programming languages.

To Reproduce

fn main {
  let a = "abc"
  let b = "b"
  println(a.compare(b)) // outputs 1
}

Expected behavior
outputs -1

Additional context

javascript

"abc".localeCompare("b") // returns a negative number (abc < b)

python

"abc" < "b"   # True (lexicographical comparison)

rust

"abc".cmp("b")   // returns Ordering::Less

peter-jerry-ye (Maintainer) · 2025-07-31T02:02:36Z

This is not a bug but a design choice. You can argue that lexicographical order is easy to define for ASCII, but what about other languages? What about the same character under different locales?

In short, String and many other types in the core (Array, List, you name it) all compare by size first and then by content. The main benefit is speed, and Compare only requires a stable order. If a specific ordering is needed (for example, lexicographical order), it can be implemented using a newtype.
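
To make the difference concrete, here is a minimal sketch of the two orderings in Python. It is an illustration only, not MoonBit's actual implementation; the function names `length_first_compare` and `lexicographic_compare` are hypothetical.

```python
def length_first_compare(a: str, b: str) -> int:
    """Compare by length first, then by content (the ordering described above)."""
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    # Same length: fall back to comparing the content.
    return (a > b) - (a < b)


def lexicographic_compare(a: str, b: str) -> int:
    """Compare character by character; Python compares str by code point."""
    return (a > b) - (a < b)


print(length_first_compare("abc", "b"))   # 1: "abc" is longer than "b"
print(lexicographic_compare("abc", "b"))  # -1: "a" sorts before "b"
```

The length check can decide the result without reading any characters, which is the speed benefit mentioned above; the two functions only agree when both strings have the same length.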

quake (Author) · 2025-07-31T08:36:46Z

Thank you for the clarification.

I understand that this is an intentional design choice for performance reasons, and that Compare only requires a stable order. However, I believe this decision sacrifices a widely expected behavior in favor of a marginal optimization, and may create confusion or frustration for users coming from other languages.

Most modern languages, such as JavaScript, Python, Rust, Java, and Go, define string and array comparison using standard lexicographical order (element by element, for strings typically over Unicode code points or code units). This is what developers naturally expect when using comparison operators or sorting functions. Deviating from this convention introduces cognitive overhead, especially for beginners or users trying to port existing logic from other ecosystems.

If performance is a concern, a better approach would be to provide an option or parameter to compare that allows customizing the comparison logic, for example, by first comparing length, then falling back to the standard comparison. But the default behavior of Compare should align with established conventions, especially for foundational types like String and Array, where developers reasonably expect lexicographical ordering.

Designing a language involves trade-offs, but consistency with widely understood behaviors should not be underestimated, especially when it comes to something as fundamental as string comparison.

I hope you will reconsider this decision.

peter-jerry-ye (Maintainer) · 2025-07-31T09:19:27Z

The principle is to avoid burdening people with costs they do not wish to pay. We assume that in many cases, e.g. binary search, people just need some ordering. It would be great if you could provide a concrete real-world example where lexicographical ordering is actually needed, apart from String, which, as I mentioned, requires much more information and can't simply be handled with a compare(a, b).

peter-jerry-ye (Maintainer) · 2025-07-31T09:28:10Z

I do think, though, that wherever a T : Compare trait bound is used, there should be another function / method that accepts a custom comparison function, which is not always the case now.
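
As a rough illustration of that idea, the sketch below pairs a binary search with a caller-supplied comparison function, written in Python rather than MoonBit. The names `binary_search_by` and `case_insensitive` are hypothetical and do not refer to any existing MoonBit API.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def binary_search_by(
    items: Sequence[T], target: T, compare: Callable[[T, T], int]
) -> Optional[int]:
    """Binary search over items that are already sorted according to compare."""
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        order = compare(items[mid], target)
        if order == 0:
            return mid
        if order < 0:
            lo = mid + 1
        else:
            hi = mid
    return None


def case_insensitive(a: str, b: str) -> int:
    """A custom ordering: compare the lowercased forms of both strings."""
    x, y = a.lower(), b.lower()
    return (x > y) - (x < y)


keys = ["Apple", "banana", "Cherry"]  # sorted under case_insensitive
print(binary_search_by(keys, "cherry", case_insensitive))  # 2
```

The search itself never assumes a particular ordering; it only needs the data to be sorted by the same function it is given.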

quake (Author) · 2025-07-31T11:03:06Z

> It would be great if you could provide concrete real world example where the lexicographical ordering is necessarily needed

Take comparing dotted version numbers as an example. In other programming languages this is straightforward: split the version string by ".", convert each part to an integer, and then compare the resulting arrays or vectors directly with the comparison operators.

Other languages output true. Rust, for example:

    let a = vec![1, 3, 1]; // "1.3.1"
    let b = vec![2, 0]; // "2.0"
    println!("{}", a < b);

MoonBit outputs false:

    let a = [1, 3, 1] // "1.3.1"
    let b = [2, 0] // "2.0"
    println(a < b)
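
The two results can be reproduced side by side in Python, whose list comparison is lexicographic. This is a sketch only: `parse_version` and `length_first_less` are hypothetical helpers, and `length_first_less` merely imitates the length-first rule described in this thread.

```python
def parse_version(text: str) -> list[int]:
    """Split a dotted version string into integer segments."""
    return [int(part) for part in text.split(".")]


def length_first_less(a: list[int], b: list[int]) -> bool:
    """Imitate length-first ordering: a shorter list always sorts first."""
    if len(a) != len(b):
        return len(a) < len(b)
    return a < b


a = parse_version("1.3.1")
b = parse_version("2.0")
print(a < b)                    # True: 1 < 2 decides at the first segment
print(length_first_less(a, b))  # False: a has three segments, b has two
```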

Beyond segment version comparison, there are many real-world scenarios that rely on comparing strings, arrays, or vectors by comparing elements in order before considering length. Examples include Trie data structures used for prefix matching, high-performance route searches in networking, autocomplete systems that suggest entries, and key-value store prefix iterators that efficiently scan keys sharing common prefixes. In all these cases, the comparison semantics depend on element-wise lexicographical ordering rather than simply comparing lengths first. Using length-first comparison would break the correctness and performance of these fundamental algorithms and data structures.
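
The prefix-iterator case can be sketched in a few lines of Python. The keys and the `prefix_scan` helper are made up for illustration; the point is only that keys sharing a prefix form one contiguous run under lexicographic order, which a length-first order does not guarantee.

```python
from bisect import bisect_left


def prefix_scan(sorted_keys: list[str], prefix: str) -> list[str]:
    """Return all keys starting with prefix.

    Assumes sorted_keys is in lexicographic order, so the matching keys
    form one contiguous run that starts at the bisect position.
    """
    start = bisect_left(sorted_keys, prefix)
    end = start
    while end < len(sorted_keys) and sorted_keys[end].startswith(prefix):
        end += 1
    return sorted_keys[start:end]


keys = ["user:10", "order:7", "user:2", "order:15", "user:1"]

lexicographic = sorted(keys)
length_first = sorted(keys, key=lambda k: (len(k), k))

print(lexicographic)  # ['order:15', 'order:7', 'user:1', 'user:10', 'user:2']
print(length_first)   # ['user:1', 'user:2', 'order:7', 'user:10', 'order:15']
print(prefix_scan(lexicographic, "user:"))  # ['user:1', 'user:10', 'user:2']

# Positions of the "user:" keys in the length-first order are not contiguous.
print([i for i, k in enumerate(length_first) if k.startswith("user:")])  # [0, 1, 3]
```

Under the length-first order, a scan for "user:" would have to visit every length bucket separately instead of reading one range.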

This difference in the built-in Compare semantics for built-in types (strings, arrays, and vectors) also poses unnecessary challenges for MoonBit's goal of leveraging AI to port libraries from other languages to build its ecosystem. Since MoonBit's default comparison behavior differs from that of most mainstream languages, code relying on standard lexicographical ordering will require additional adaptation, increasing complexity and reducing compatibility.
