Add 'indirect_sort' #117
base: develop
Changes from 1 commit
17c47e8
814f8a5
3ae9ee2
8be54b3
a7ae53d
25ab833
62922bd
@@ -0,0 +1,71 @@ | ||||||
[/ File indirect_sort.qbk] | ||||||
|
||||||
[section:indirect_sort indirect_sort ] | ||||||
|
||||||
[/license | ||||||
Copyright (c) 2023 Marshall Clow | ||||||
|
||||||
Distributed under the Boost Software License, Version 1.0. | ||||||
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | ||||||
] | ||||||
|
||||||
There are times that you want a sorted version of a sequence, but for some reason or another, you don't really want to sort them. Maybe the elements in the sequence are non-copyable (or non-movable), or the sequence is const, or they're just really expensive to move around. An example of this might be a sequence of records from a database. | ||||||
|
||||||
Nevertheless, you might want to sort them. That's where indirect sorting comes in. In a "normal" sort, the elements of the sequence to be sorted are shuffled in place. In indirect sorting, the elements are unchanged, but the sort algorithm returns to you a "permutation" of the elements that, when applied, will leave the elements in the sequence in a sorted order. | ||||||
|
||||||
Say you have a sequence `[first, last)` of 1000 items that are expensive to swap: | ||||||
|
||||||
``` | ||||||
std::sort(first, last); // ['O(N lg N)] comparisons and ['O(N lg N)] swaps (of the element type).
``` | ||||||
|
||||||
On the other hand, using indirect sorting: | ||||||
``` | ||||||
auto permutation = boost::algorithm::indirect_sort(first, last); // ['O(N lg N)] comparisons and ['O(N lg N)] swaps (of size_t).
boost::algorithm::apply_permutation(first, last, permutation.begin(), permutation.end()); // ['O(N)] swaps (of the element type)
``` | ||||||
|
||||||
If the element type is sufficiently expensive to swap, then roughly 10,000 swaps of `size_t` (N lg N for N = 1000) plus at most 1000 swaps of the element type could be cheaper than roughly 10,000 swaps of the element type.
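The linear bound on element swaps comes from following the cycles of the permutation. The sketch below is a minimal illustration built only on the standard library; it is not the implementation of `boost::algorithm::apply_permutation`. The names `indirect_sort_sketch` and `apply_permutation_sketch` are hypothetical, and the code assumes that `perm[i]` holds the index of the element that belongs at position `i`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: returns indices into v ordered so that
// v[perm[0]], v[perm[1]], ... is sorted by operator<.
template <typename T>
std::vector<std::size_t> indirect_sort_sketch(const std::vector<T>& v) {
    std::vector<std::size_t> perm(v.size());
    std::iota(perm.begin(), perm.end(), std::size_t(0));
    std::sort(perm.begin(), perm.end(),
              [&v](std::size_t a, std::size_t b) { return v[a] < v[b]; });
    return perm;
}

// Hypothetical helper: rearranges v so that the new v[i] equals the old
// v[perm[i]], walking each cycle of the permutation once.
// Returns the number of element swaps performed (fewer than v.size()
// for a non-empty sequence). perm is taken by value because it is
// used as scratch space.
template <typename T>
std::size_t apply_permutation_sketch(std::vector<T>& v, std::vector<std::size_t> perm) {
    assert(perm.size() == v.size());
    std::size_t swaps = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        std::size_t cur = i;
        while (perm[cur] != i) {
            const std::size_t next = perm[cur];
            std::swap(v[cur], v[next]);  // v[cur] now holds its final value
            ++swaps;
            perm[cur] = cur;             // mark position as done
            cur = next;
        }
        perm[cur] = cur;                 // close the cycle
    }
    return swaps;
}
```

Each swap puts one element into its final position, so a cycle of length k costs k - 1 element swaps and the total never reaches N.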
|
||||||
Or maybe you don't need the elements to actually be sorted - you just want to traverse them in a sorted order: | ||||||
``` | ||||||
auto permutation = boost::algorithm::indirect_sort(first, last); | ||||||
for (size_t idx: permutation) | ||||||
std::cout << first[idx] << std::endl; | ||||||
``` | ||||||
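The traversal above can also be tried without the header from this pull request. The following is a sketch under stated assumptions: `indirect_sort_sketch` and `join_sorted` are hypothetical names, the helper uses only the standard library, and elements are compared with `operator<`. It shows that a `const` sequence can be visited in sorted order without being reordered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical stand-in for indirect_sort: sorts indices, not elements.
template <typename T>
std::vector<std::size_t> indirect_sort_sketch(const std::vector<T>& v) {
    std::vector<std::size_t> perm(v.size());
    std::iota(perm.begin(), perm.end(), std::size_t(0));
    std::sort(perm.begin(), perm.end(),
              [&v](std::size_t a, std::size_t b) { return v[a] < v[b]; });
    return perm;
}

// Visits the names in sorted order and joins them with commas.
// The input is const and is never reordered.
std::string join_sorted(const std::vector<std::string>& names) {
    const std::vector<std::size_t> perm = indirect_sort_sketch(names);
    assert(perm.size() == names.size());
    std::string out;
    for (std::size_t k = 0; k < perm.size(); ++k) {
        if (k != 0) out += ',';
        out += names[perm[k]];
    }
    return out;
}
```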
|
||||||
|
||||||
More to come here .... | ||||||
|
||||||
[heading Interface]
|
||||||
The function `indirect_sort` returns a `vector<size_t>` containing the permutation necessary to put the input sequence into a sorted order. One version uses `std::less` to do the comparisons; the other lets the caller pass a predicate to do the comparisons.
|
||||||
``` | ||||||
template <typename RAIterator> | ||||||
std::vector<size_t> indirect_sort (RAIterator first, RAIterator last); | ||||||
|
||||||
template <typename RAIterator, typename BinaryPredicate> | ||||||
std::vector<size_t> indirect_sort (RAIterator first, RAIterator last, BinaryPredicate pred); | ||||||
``` | ||||||
|
||||||
[heading Examples] | ||||||
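As a hedged illustration of the two overloads, the sketch below orders a sequence of non-copyable records by different keys without moving any record. Everything here is an assumption made for the example: `indirect_sort_sketch` is a standard-library stand-in rather than the function from this library, and `Record`, `id_less`, and `name_greater` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical stand-in for the predicate overload.
template <typename RAIterator, typename Pred>
std::vector<std::size_t> indirect_sort_sketch(RAIterator first, RAIterator last, Pred pred) {
    assert(std::distance(first, last) >= 0);
    std::vector<std::size_t> perm(static_cast<std::size_t>(std::distance(first, last)));
    std::iota(perm.begin(), perm.end(), std::size_t(0));
    // Compare the elements the indices refer to; only indices are swapped.
    std::sort(perm.begin(), perm.end(),
              [first, pred](std::size_t a, std::size_t b) { return pred(first[a], first[b]); });
    return perm;
}

// Hypothetical stand-in for the overload that defaults to std::less.
template <typename RAIterator>
std::vector<std::size_t> indirect_sort_sketch(RAIterator first, RAIterator last) {
    typedef typename std::iterator_traits<RAIterator>::value_type value_type;
    return indirect_sort_sketch(first, last, std::less<value_type>());
}

// A record type that can be neither copied nor moved.
struct Record {
    int id;
    std::string name;
    Record(int i, const char* n) : id(i), name(n) {}
    Record(const Record&) = delete;
    Record& operator=(const Record&) = delete;
};

bool id_less(const Record& a, const Record& b) { return a.id < b.id; }
bool name_greater(const Record& a, const Record& b) { return a.name > b.name; }
```

Because only indices are sorted, the same records can be ordered by several keys at once, each ordering held in its own permutation.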
|
||||||
[heading Iterator Requirements] | ||||||
|
||||||
`indirect_sort` requires random-access iterators. | ||||||
|
||||||
[heading Complexity] | ||||||
|
||||||
Both of the variants of `indirect_sort` run in ['O(N lg N)] time; they are not more (or less) efficient than `std::sort`. There is an extra layer of indirection on each comparison, but all of the swaps are done on values of type `size_t`.
|
||||||
[heading Exception Safety] | ||||||
|
||||||
[heading Notes] | ||||||
|
||||||
[endsect] | ||||||
|
||||||
[/ File indirect_sort.qbk | ||||||
Copyright 2023 Marshall Clow | ||||||
Distributed under the Boost Software License, Version 1.0. | ||||||
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt). | ||||||
] |
@@ -0,0 +1,83 @@ | ||||||
/* | ||||||
Copyright (c) Marshall Clow 2023. | ||||||
|
||||||
Distributed under the Boost Software License, Version 1.0. (See accompanying | ||||||
file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | ||||||
|
||||||
*/ | ||||||
|
||||||
/// \file indirect_sort.hpp | ||||||
/// \brief indirect sorting algorithms | ||||||
/// \author Marshall Clow | ||||||
/// | ||||||
|
||||||
#ifndef BOOST_ALGORITHM_IS_INDIRECT_SORT | ||||||
#define BOOST_ALGORITHM_IS_INDIRECT_SORT | ||||||
Review comment: Unusual include guard. Why not
||||||
|
||||||
#include <algorithm> // for std::sort (and others) | ||||||
#include <functional> // for std::less | ||||||
#include <iterator> // for std::distance and std::iterator_traits
#include <vector> // for std:;vector | ||||||
Review comment: Typo: `std:;vector` should read `std::vector`. But is that comment really required?
||||||
|
||||||
#include <boost/algorithm/cxx11/iota.hpp> | ||||||
|
||||||
namespace boost { namespace algorithm { | ||||||
|
||||||
namespace detail { | ||||||
|
||||||
template <class Predicate, class Iter> | ||||||
struct indirect_predicate { | ||||||
indirect_predicate (Predicate pred, Iter iter) | ||||||
: pred_(pred), iter_(iter) {} | ||||||
|
||||||
bool operator ()(size_t a, size_t b) const { | ||||||
return pred_(iter_[a], iter_[b]); | ||||||
} | ||||||
|
||||||
Predicate pred_; | ||||||
Iter iter_; | ||||||
}; | ||||||
|
||||||
} | ||||||
|
||||||
typedef std::vector<size_t> Permutation; | ||||||
|
||||||
// ===== sort ===== | ||||||
|
||||||
/// \fn indirect_sort (RAIterator first, RAIterator last, Pred pred)
/// \returns a permutation of the elements in the range [first, last) | ||||||
/// such that when the permutation is applied to the sequence, | ||||||
/// the result is sorted according to the predicate pred. | ||||||
/// | ||||||
/// \param first The start of the input sequence | ||||||
/// \param last The end of the input sequence | ||||||
/// \param pred The predicate to compare elements with | ||||||
/// | ||||||
template <typename RAIterator, typename Pred> | ||||||
std::vector<size_t> indirect_sort (RAIterator first, RAIterator last, Pred pred) { | ||||||
|
||||||
Permutation ret(std::distance(first, last)); | ||||||
boost::algorithm::iota(ret.begin(), ret.end(), size_t(0)); | ||||||
std::sort(ret.begin(), ret.end(), | ||||||
detail::indirect_predicate<Pred, RAIterator>(pred, first)); | ||||||
return ret; | ||||||
} | ||||||
|
||||||
/// \fn indirect_sort (RAIterator first, RAIterator last)
|
||||||
/// \returns a permutation of the elements in the range [first, last) | ||||||
/// such that when the permutation is applied to the sequence, | ||||||
/// the result is sorted according to std::less.
|
||||||
/// | ||||||
/// \param first The start of the input sequence | ||||||
/// \param last The end of the input sequence | ||||||
/// | ||||||
template <typename RAIterator> | ||||||
std::vector<size_t> indirect_sort (RAIterator first, RAIterator last) { | ||||||
|
||||||
return indirect_sort(first, last, | ||||||
std::less<typename std::iterator_traits<RAIterator>::value_type>()); | ||||||
} | ||||||
|
||||||
// ===== stable_sort ===== | ||||||
// ===== partial_sort ===== | ||||||
// ===== nth_element ===== | ||||||
}} | ||||||
|
||||||
#endif // BOOST_ALGORITHM_IS_INDIRECT_SORT |
@@ -0,0 +1,100 @@ | ||||||
/* | ||||||
Copyright (c) Marshall Clow 2011-2012. | ||||||
|
||||||
|
||||||
Distributed under the Boost Software License, Version 1.0. (See accompanying | ||||||
file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | ||||||
|
||||||
For more information, see http://www.boost.org | ||||||
*/ | ||||||
|
||||||
#include <boost/config.hpp> | ||||||
#include <boost/algorithm/indirect_sort.hpp> | ||||||
#include <boost/algorithm/apply_permutation.hpp> | ||||||
#include <boost/algorithm/cxx11/is_sorted.hpp> | ||||||
|
||||||
#define BOOST_TEST_MAIN | ||||||
#include <boost/test/unit_test.hpp> | ||||||
|
||||||
#include <iostream> | ||||||
#include <string> | ||||||
#include <vector> | ||||||
#include <list> | ||||||
|
||||||
typedef std::vector<size_t> Permutation; | ||||||
|
||||||
// A permutation of size N is a sequence of values in the range [0..N) | ||||||
// such that no value appears more than once in the permutation. | ||||||
bool isa_permutation(Permutation p, size_t N) { | ||||||
Review comment: [suggested change] is more readable.
||||||
if (p.size() != N) return false; | ||||||
|
||||||
// Sort the permutation, and ensure that each value appears exactly once. | ||||||
std::sort(p.begin(), p.end()); | ||||||
for (size_t i = 0; i < N; ++i) | ||||||
if (p[i] != i) return false; | ||||||
return true; | ||||||
} | ||||||
|
||||||
template <typename Iter, | ||||||
typename Comp = typename std::less<typename std::iterator_traits<Iter>::value_type> > | ||||||
struct indirect_comp { | ||||||
indirect_comp (Iter it, Comp c = Comp()) | ||||||
: iter_(it), comp_(c) {} | ||||||
|
||||||
bool operator ()(size_t a, size_t b) const { return comp_(iter_[a], iter_[b]);} | ||||||
|
||||||
Iter iter_; | ||||||
Comp comp_; | ||||||
}; | ||||||
|
||||||
template <typename Iter> | ||||||
void test_one_sort(Iter first, Iter last) { | ||||||
Permutation perm = boost::algorithm::indirect_sort(first, last); | ||||||
BOOST_CHECK (isa_permutation(perm, std::distance(first, last))); | ||||||
BOOST_CHECK (boost::algorithm::is_sorted(perm.begin(), perm.end(), indirect_comp<Iter>(first))); | ||||||
|
||||||
// Make a copy of the data, apply the permutation, and ensure that it is sorted. | ||||||
std::vector<typename std::iterator_traits<Iter>::value_type> v(first, last); | ||||||
boost::algorithm::apply_permutation(v.begin(), v.end(), perm.begin(), perm.end()); | ||||||
BOOST_CHECK (boost::algorithm::is_sorted(v.begin(), v.end())); | ||||||
} | ||||||
|
||||||
template <typename Iter, typename Comp> | ||||||
void test_one_sort(Iter first, Iter last, Comp comp) { | ||||||
Permutation perm = boost::algorithm::indirect_sort(first, last, comp); | ||||||
BOOST_CHECK (isa_permutation(perm, std::distance(first, last))); | ||||||
BOOST_CHECK (boost::algorithm::is_sorted(perm.begin(), perm.end(), | ||||||
indirect_comp<Iter, Comp>(first, comp))); | ||||||
|
||||||
// Make a copy of the data, apply the permutation, and ensure that it is sorted. | ||||||
std::vector<typename std::iterator_traits<Iter>::value_type> v(first, last); | ||||||
boost::algorithm::apply_permutation(v.begin(), v.end(), perm.begin(), perm.end()); | ||||||
BOOST_CHECK (boost::algorithm::is_sorted(v.begin(), v.end(), comp)); | ||||||
} | ||||||
|
||||||
|
||||||
void test_sort () { | ||||||
int num[] = { 1, 3, 5, 7, 9, 2, 4, 6, 8, 10 }; // not constexpr: a non-const int* is taken below
||||||
const size_t sz = sizeof (num)/sizeof(num[0]);
int *first = &num[0]; | ||||||
int const *cFirst = &num[0]; | ||||||
|
||||||
// Test subsets | ||||||
for (size_t i = 0; i <= sz; ++i) { | ||||||
test_one_sort(first, first + i); | ||||||
test_one_sort(first, first + i, std::greater<int>()); | ||||||
|
||||||
// test with constant inputs | ||||||
test_one_sort(cFirst, cFirst + i); | ||||||
test_one_sort(cFirst, cFirst + i, std::greater<int>()); | ||||||
} | ||||||
|
||||||
// make sure we work with iterators as well as pointers | ||||||
std::vector<int> v(first, first + sz); | ||||||
test_one_sort(v.begin(), v.end()); | ||||||
test_one_sort(v.begin(), v.end(), std::greater<int>()); | ||||||
} | ||||||
|
||||||
BOOST_AUTO_TEST_CASE( test_main ) | ||||||
Review comment: Why that extra method and not using
Reply: Because I expect there to be more test cases in the future.
Reply: But the whole idea of -->
|
||||||
{ | ||||||
test_sort (); | ||||||
} |
How about this slightly shorter wording, which in particular avoids mentioning the need to sort twice:
Are the double-spaces after each sentence intended?