Skip to content

feat: introduce @list.sort_v2 to improve sort stack safety #2154

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

Erchiusx
Copy link
Contributor

Previous @list.sort would yield a stack overflow at a data scale of 100_00
introduced @list.sort_v2 which uses @array.FixedArray::sort to help sorting the list

Copy link

peter-jerry-ye-code-review bot commented May 23, 2025

Stack overflow fix by avoiding recursive implementation

Category
Performance
Code Snippet
Previous:

smaller.sort().concat(More(x, tail=greater.sort()))

New:

let arr = self.to_array()
arr.sort()
from_array(arr)

Recommendation
The new implementation is correct. Consider adding a comment explaining why array-based sorting is preferred over recursive list sorting.
Reasoning
The original recursive implementation would create O(n) stack frames for large lists, causing stack overflow. The array-based approach is both more space-efficient and faster.

Test case could be more comprehensive

Category
Maintainability
Code Snippet

test "sort-stack-safety" {
  let arr = FixedArray::makei(100_000, fn { x => x })
  arr.rev_inplace()
  let ls = @list.of(arr)
  ls.sort() |> ignore
}

Recommendation
Add assertions to verify the sorting result is correct, not just that it completes without stack overflow
Reasoning
Current test only verifies the function doesn't crash but doesn't validate correctness. Adding result verification would make the test more robust.

Documentation needs updating

Category
Maintainability
Code Snippet

/// assert_eq(ls, @list.from_array([-76, -6, 0, 1, 3, 6, 52, 123]))

Recommendation
Add a note about the performance characteristics of the new implementation in the function documentation
Reasoning
Documentation should reflect significant implementation details that affect performance characteristics, especially when they differ from typical list operations

@coveralls
Copy link
Collaborator

coveralls commented May 23, 2025

Pull Request Test Coverage Report for Build 6927

Details

  • 3 of 3 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.002%) to 92.452%

Totals Coverage Status
Change from base Build 6925: -0.002%
Covered Lines: 8684
Relevant Lines: 9393

💛 - Coveralls

@peter-jerry-ye
Copy link
Collaborator

I think you can either:

  • replace the existing implementation altogether
  • try to find a boundary where the original is more efficient (which might not even exist) and keep both

@Erchiusx
Copy link
Contributor Author

I think you can either:

* replace the existing implementation altogether

* try to find a boundary where the original is more efficient (which might not even exist) and keep both

The original implementation has a 3-constant factor, I think replacing it would count

@peter-jerry-ye peter-jerry-ye requested a review from Lampese May 23, 2025 07:05
Copy link
Collaborator

@Lampese Lampese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@FlyCloudC
Copy link
Contributor

The test seems a bit oversized, which may slow down the standard library's test suite. If it's for correctness, 100,000 might be overkill, and if it's only for speed measurements, maybe we can delete it after?

@bobzhang
Copy link
Contributor

bobzhang commented May 24, 2025

The test seems a bit oversized, which may slow down the standard library's test suite. If it's for correctness, 100,000 might be overkill, and if it's only for speed measurements, maybe we can delete it after?

sorting 100_000 elements should be fine, we may add some attributes to allow test filtering mechanism in the future

@Erchiusx can you do a rebase to keep the history clean

@Erchiusx
Copy link
Contributor Author

100_000 is not a must, 100_00 is enough to show the difference

perhaps I should change it to 100_00?

@Erchiusx
Copy link
Contributor Author

The rebase is done, but seems still squashable

@Lampese
Copy link
Collaborator

Lampese commented May 24, 2025

Later on, I will establish a core-extra-test repo with interns from @plct-lab to conduct some stress tests on the core. The testing in the core can be relatively reduced. cc @CAIMEOX

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants