Merged
Changes from 9 commits
Binary file removed doc/Identifiers.docx
Binary file not shown.
184 changes: 184 additions & 0 deletions doc/algorithms.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,184 @@
# Algorithms

## Definitions

A graph path _p_ is a possibly empty sequence of graph edges (_e_<sub>0</sub>, _e_<sub>1</sub>, ..., _e_<sub>_N_</sub>) where:
* _e_<sub>_i_</sub> ≠ _e_<sub>_j_</sub> for _i_ ≠ _j_,
* target(_e_<sub>_i_</sub>) = source(_e_<sub>_i_+1</sub>),
* source(_e_<sub>_i_</sub>) ≠ source(_e_<sub>_j_</sub>) for _i_ ≠ _j_.

<code><i>path-source</i>(<i>p</i>)</code> = source(_e_<sub>0</sub>). <code><i>path-target</i>(<i>p</i>)</code> = target(_e_<sub>_N_</sub>).

<code><i>distance</i>(<i>p</i>)</code> is the sum over _i_ of <code>weight</code>(_e_<sub>_i_</sub>).

<code><i>shortest-path</i>(g, u, v)</code> is a path that has the smallest value of <code><i>distance</i>(<i>p</i>)</code> among all paths `p` in graph `g`
with <code><i>path-source</i>(<i>p</i>)</code> = `u` and <code><i>path-target</i>(<i>p</i>)</code> = `v`.

<code><i>shortest-path-distance</i>(g, u, v)</code> is <code><i>distance</i>(<i>shortest-path</i>(g, u, v))</code> if it exists and _infinite-distance_ otherwise.

<code><i>shortest-path-predecessor</i>(g, u, v)</code> is defined in terms of <code><i>shortest-path</i>(g, u, v)</code>:
* if that path contains an edge _e_ with target(_e_) = `v`, then it is source(_e_),
* otherwise it is `v`.

## Visitors

A number of functions in this section take a _visitor_ as an optional argument.
As different _events_, related to vertices and edges, occur during the execution of an algorithm,
a corresponding member function, if present, is called for the visitor.

Collaborator

Some points that might be useful to people.

  1. No runtime overhead occurs if the visitor function isn't defined on the visitor class.
  2. The visitor functions are the same used by boost::graph.
  3. Each algorithm defines the visitor functions they support. Additional functions included in the visitor that aren't supported are ignored.

Contributor Author

Applied #1 and #3. Not sure about #2. First, strictly speaking this cannot be true. The Boost.Graph names are without the on_ prefix. Second, I wouldn't like to give an impression that one needs to know Boost.Graph to understand this library.

### <code><em>GraphVisitor</em></code> requirements

The following lists the visitation events and the corresponding visitor member functions.
For each of the events the visitor may choose to support it via making the corresponding member
function valid.

The notation used:

| name | type | definition |
|-------|------|-------------|
| `vis` | | the visitor |
| `G` | | the type of the graph that the algorithm is instantiated for |
| `vd` | `vertex_info<vertex_id_t<G>, vertex_reference_t<G>, void>` | visited vertex |
| `ed` | `edge_info<vertex_id_t<G>, true, edge_reference_t<G>, void>` | visited edge |

Collaborator

Missing visitor functions:

  • on_initialize_vertex(vd)
  • on_examine_vertex(vd)
  • on_tree_edge(ed)
  • on_back_edge(ed)
  • on_forward_or_cross_edge(ed)
  • on_finish_edge(ed)
    See wg21.link/P3128 for descriptions.

Contributor Author

I would prefer to stick to the approach where the documentation describes what is present in the library rather than what is present in the paper.

  • on_initialize_vertex(vd) -- I will document it, but first I would like to understand what this one is for: Use cases for visitors #157
  • on_examine_vertex(vd) -- it is documented
  • Regarding the last four, these names do not occur even once in the graph-v2 library, at least in branch master. I would rather describe them when algorithm depth_first_search is implemented.

```c++
vis.on_discover_vertex(vd)
```

If valid, it is called whenever a new vertex is identified for future examination in the
course of executing an algorithm.

(Note: the vertices provided as _seeds_ to algorithms are initially discovered.)

```c++
vis.on_examine_vertex(vd)
```

If valid, it is called whenever the algorithm starts examining a previously discovered vertex.

(Note: examining a vertex usually triggers the discovery of other vertices and edges.)

```c++
vis.on_finish_vertex(vd)
```

If valid, it is called whenever an algorithm finishes examining the vertex.

(Note: If the graph is unbalanced and another path to this vertex has a lower accumulated
weight, the algorithm will process `vd` again.
A consequence is that `on_examine_vertex` could be called twice (or more) on the
same vertex.)

```c++
vis.on_examine_edge(ed)
```

If valid, it is called whenever the algorithm starts examining a new edge.




```c++
vis.on_edge_relaxed(ed)
```

If valid, it is called whenever an edge is _relaxed_. Relaxing an edge means reducing
the stored minimum accumulated distance found so far from the given seed to the target
of the examined edge `ed`.


```c++
vis.on_edge_not_relaxed(ed)
```

If valid, it is called whenever a new edge `ed` is inspected but not relaxed (because
the stored accumulated distance to the target of `ed` found so far is no greater than the distance of the path via `ed`).
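
The relaxation step described above can be sketched as follows. This is an illustration only: `relax` is a hypothetical helper, not a function of this library, and it assumes distances stored in a `std::vector` indexed by vertex id, with `Compare` and `Combine` defaulting to `std::less` and `std::plus`.

```c++
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical relax step for an edge (u, v) with the given weight.
// Returns true if the stored distance to v was lowered.
template <class Distance,
          class Compare = std::less<Distance>,
          class Combine = std::plus<Distance>>
bool relax(std::vector<Distance>& distances,
           std::size_t u, std::size_t v, Distance weight,
           Compare compare = Compare(), Combine combine = Combine()) {
  assert(u < distances.size() && v < distances.size());
  const Distance via_edge = combine(distances[u], weight);
  if (compare(via_edge, distances[v])) {
    distances[v] = via_edge;
    return true;   // an algorithm would report on_edge_relaxed here
  }
  return false;    // an algorithm would report on_edge_not_relaxed here
}
```

With a strict comparison such as `std::less`, an edge that only ties the stored distance is reported as not relaxed.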

```c++
vis.on_edge_minimized(ed)
```

If valid, it is called when no cycles have been detected while examining the edge `ed`.


```c++
vis.on_edge_not_minimized(ed)
```

If valid, it is called when a cycle has been detected while examining the edge `ed`.
This happens in shortest paths algorithms that accept negative weights, and means that
no finite minimum exists.


## `dijkstra_shortest_paths`

The shortest paths algorithm builds on the idea that each edge in a graph has its associated _weight_.
A path _distance_ is determined by the composition of weights of edges that constitute the path.

Collaborator

A suggestion:
A path distance is the sum of the edge weights of the path.

Contributor Author

I am reluctant to use word "sum" due to this example: #169.

By default the composition of the edge weights is summation and the default weight is 1,
so the path distance is the number of edges that it comprises.
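
As a small illustration of "composition of weights", the sketch below folds a sequence of edge weights with a combine function. `path_distance` is a hypothetical helper, not part of this library, and the maximum-based composition in the usage note is only an invented example of a non-additive composition.

```c++
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: composes the weights of the edges of one path.
// By default the composition is summation, starting from Distance().
template <class Distance, class Combine = std::plus<Distance>>
Distance path_distance(const std::vector<Distance>& edge_weights,
                       Distance initial = Distance(),
                       Combine combine = Combine()) {
  return std::accumulate(edge_weights.begin(), edge_weights.end(), initial, combine);
}

// With the default weight of 1 for every edge and the default summation,
// the distance of a three-edge path equals its edge count.
inline void path_distance_demo() {
  const std::vector<int> unit_weights{1, 1, 1};
  assert(path_distance(unit_weights) == 3);
}
```

Passing a different callable, for example `[](int a, int b) { return a > b ? a : b; }`, would make the "distance" the largest weight on the path instead of the sum.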

Dijkstra's shortest paths algorithm also assumes that appending an edge to a path never _decreases_
the path's distance. In terms of the default composition and weight, this assumption is expressed as `weight(uv) >= 0`.

The distance of each shortest path is returned directly via the output function argument.
The paths themselves, if requested, are only returned indirectly by providing for each vertex
its predecessor in any shortest path.
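
To show what "returned indirectly" means in practice, here is a sketch that rebuilds one path from a predecessor container. `reconstruct_path` is a hypothetical helper and not part of this library; it assumes vertex ids are indices and that a vertex which is its own predecessor was not reached from the source.

```c++
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: follows predecessor links from `target` back to `source`.
// Returns the vertex ids from source to target, or an empty vector if no path exists.
inline std::vector<std::size_t> reconstruct_path(const std::vector<std::size_t>& predecessor,
                                                 std::size_t source, std::size_t target) {
  assert(source < predecessor.size() && target < predecessor.size());
  std::vector<std::size_t> path{target};
  for (std::size_t v = target; v != source; v = predecessor[v]) {
    // A vertex that is its own predecessor was never reached from `source`.
    // The size check guards against malformed input that contains a cycle.
    if (predecessor[v] == v || path.size() > predecessor.size()) return {};
    path.push_back(predecessor[v]);
  }
  std::reverse(path.begin(), path.end());
  return path;
}
```

For `predecessor = {0, 0, 1, 3}` and source `0`, the path to vertex `2` is `0, 1, 2`, while vertex `3` is reported as unreachable.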

Collaborator

Suggested wording for clarity:
The distances of each edge are returned directly via the output function argument. The paths themselves are returned indirectly in the predecessors vector, where each element is the preceding vertex for the matching vertex.
(My wording still feels a little awkward; just giving ideas)


### The single source version

Header `<graph/algorithm/dijkstra_shortest_paths.hpp>`

```c++
template <index_adjacency_list G,
std::ranges::random_access_range Distances,
std::ranges::random_access_range Predecessors,
class WF = function<std::ranges::range_value_t<Distances>(edge_reference_t<G>)>,
class Visitor = empty_visitor,
class Compare = less<std::ranges::range_value_t<Distances>>,
class Combine = plus<std::ranges::range_value_t<Distances>>>
requires std::is_arithmetic_v<std::ranges::range_value_t<Distances>> &&
std::ranges::sized_range<Distances> &&
std::ranges::sized_range<Predecessors> &&
convertible_to<vertex_id_t<G>, std::ranges::range_value_t<Predecessors>> &&
basic_edge_weight_function<G, WF, std::ranges::range_value_t<Distances>, Compare, Combine>
constexpr void dijkstra_shortest_paths(
G&& g,
vertex_id_t<G> source,
Distances& distances,
Predecessors& predecessor,
WF&& weight = [](edge_reference_t<G> uv) { return std::ranges::range_value_t<Distances>(1); },
Visitor&& visitor = empty_visitor(),
Compare&& compare = less<std::ranges::range_value_t<Distances>>(),
Combine&& combine = plus<std::ranges::range_value_t<Distances>>());
```

*Preconditions:*
* <code>distances[<i>i</i>] == shortest_path_infinite_distance&lt;range_value_t&lt;Distances&gt;&gt;()</code> for each <code><i>i</i></code> in range [`0`; `num_vertices(g)`),
* <code>predecessor[<i>i</i>] == <i>i</i></code> for each <code><i>i</i></code> in range [`0`; `num_vertices(g)`),
* `weight` returns non-negative values,
* `visitor` adheres to the _GraphVisitor_ requirements.

*Hardened preconditions:*
* `0 <= source && source < num_vertices(g)` is `true`,
* `std::size(distances) >= num_vertices(g)` is `true`,
* `std::size(predecessor) >= num_vertices(g)` is `true`.

*Effects:* Supports the following visitation events: `on_initialize_vertex`, `on_discover_vertex`,
`on_examine_vertex`, `on_finish_vertex`, `on_examine_edge`, `on_edge_relaxed`, and `on_edge_not_relaxed`.

*Postconditions:* For each <code><i>i</i></code> in range [`0`; `num_vertices(g)`):
* <code>distances[<i>i</i>]</code> is <code><i>shortest-path-distance</i>(g, source, <i>i</i>)</code>,
* <code>predecessor[<i>i</i>]</code> is <code><i>shortest-path-predecessor</i>(g, source, <i>i</i>)</code>.

*Throws:* `std::bad_alloc` if memory for the internal data structures cannot be allocated.

*Complexity:* Either 𝒪((|_E_| + |_V_|)⋅log |_V_|) or 𝒪(|_E_| + |_V_|⋅log |_V_|), depending on the implementation.

*Remarks:* Duplicate sources do not affect the algorithm’s complexity or correctness.
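
The contract above (preconditions on the initial contents of `distances` and `predecessor`, and the resulting postconditions) can be illustrated with a self-contained sketch. This is not the library's implementation: `adjacency`, `infinite_distance`, `identity_predecessors`, and `dijkstra_sketch` are hypothetical names, the graph is a plain adjacency list, and floating-point infinity is assumed as the stand-in for the infinite distance.

```c++
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <numeric>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical graph type: g[u] lists (target id, weight) pairs.
using adjacency = std::vector<std::vector<std::pair<std::size_t, double>>>;

// Assumed stand-in for the library's infinite distance value.
inline constexpr double infinite_distance = std::numeric_limits<double>::infinity();

// Builds the state required by the preconditions: predecessor[i] == i.
inline std::vector<std::size_t> identity_predecessors(std::size_t n) {
  std::vector<std::size_t> p(n);
  std::iota(p.begin(), p.end(), std::size_t{0});
  return p;
}

inline void dijkstra_sketch(const adjacency& g, std::size_t source,
                            std::vector<double>& distances,
                            std::vector<std::size_t>& predecessor) {
  // These checks mirror the hardened preconditions.
  assert(source < g.size());
  assert(distances.size() >= g.size());
  assert(predecessor.size() >= g.size());

  using entry = std::pair<double, std::size_t>;  // (distance, vertex id)
  std::priority_queue<entry, std::vector<entry>, std::greater<entry>> queue;

  distances[source] = 0.0;
  queue.push({0.0, source});

  while (!queue.empty()) {
    const auto [d, u] = queue.top();
    queue.pop();
    if (d > distances[u]) continue;  // stale queue entry, a shorter path was found
    for (const auto& [v, w] : g[u]) {
      assert(w >= 0.0);              // non-negative weights are a precondition
      const double candidate = distances[u] + w;
      if (candidate < distances[v]) {  // the edge is relaxed
        distances[v] = candidate;
        predecessor[v] = u;
        queue.push({candidate, v});
      }
    }
  }
}
```

After the call, unreachable vertices keep the infinite distance and remain their own predecessors, which matches the definitions of _shortest-path-distance_ and _shortest-path-predecessor_ given earlier.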


## TODO

Document all other algorithms...
117 changes: 117 additions & 0 deletions doc/customization_points.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
# Customization Points

Collaborator

I find it interesting that you label this as Customization Points.

In contrast, I've labeled it as the Graph Container Interface (GCI) because I want the user to focus on the fact that all the free functions are used to describe the interface for a Graph Container. I use "Container" to associate & distinguish it from the standard STL Containers. It is a range-of-ranges container with unique properties.

To me, Customization Points are the means to achieve the functionality I want to be able to support, namely the ability to define and override the functions for a specific graph data structure.

Toward that end, I will describe the functions in the GCI and state that they are Customization Points to help those in the know what's going on, but after that I can just tell people to define the function for their graph data structure and it all just works. If I try to go beyond that then I risk confusing the uninitiated to CPOs.

Beyond that, I also say there are reasonable defaults for all the GCI functions and give examples like vector<vector> and vector<vector<pair<int, double>>> that works by default. There are likely gaps in what I've done. It hasn't been scrutinized very much.

Contributor Author

I see. I will try to separate the docs clearly into the narrative part and the reference section, and use the term Graph Container Interface in the former.

Contributor Author

I expanded the docs structure. This should address most of your remarks.

The algorithms and views in this library operate on graph representations via _Customization Point Objects_ (CPO).
A user-defined graph representation `G` is adapted for use with this library by making sure that the necessary CPOs are _valid_ for `G`.

A CPO is a function object, so it can be passed as an argument to functions.

Each customization point specifies individually what it takes to make it valid.
A customization point can be made valid in a number of ways.
For each customization point we provide an ordered list of ways in which it can be made valid.
The order in this list matters: the match for validity is performed in order,
and if a given customization is determined to be valid, the subsequent ways, even if they would be valid, are ignored.

Often, the last item from the list serves the purpose of a "fallback" or "default" customization.

If none of the customization ways is valid for a given type, or set of types, the customization point is considered _invalid_ for this set of types.
The property of being valid or invalid can be statically tested in the program via SFINAE (like `enable_if`) tricks or `requires`-expressions.

All the customization points in this library are defined in namespace `::graph` and brought into the program code via including header `<graph/graph.hpp>`.


## The list of customization points

We use the following notation to represent the customization points:


| Symbol | Type | Meaning |
|--------|--------------------------------|------------------------------------------|
| `G` | | the type of the graph representation |
| `g` | `G` | the graph representation |
| `u` | `graph::vertex_reference_t<G>` | vertex in `g` |
| `ui` | `graph::vertex_iterator_t<G>` | iterator to a vertex in `g` |
| `uid` | `graph::vertex_id_t<G>` | _id_ of a vertex in `g` (often an index) |
| `uv` | `graph::edge_reference_t<G>` | an edge in `g` |


### `vertices`

The CPO `vertices(g)` is used to obtain the list of all vertices, in the form of a `std::ranges::random_access_range`, from the graph-representing object `g`.
We also use its return type to determine the type of the vertex: `vertex_t<G>`.

#### Customization

1. Returns `g.vertices()`, if such member function exists and returns a `std::move_constructible` type.

Collaborator

move_constructible implies that ownership of the internal vertices would be moved to the caller, which is not what we want. It should be returning a random_access_range (or future, bidirectional_range) either as a reference or to something like a subrange that is returned as a value (not a reference).

Contributor Author

struct X { X(X&&) = delete;};
static_assert(std::move_constructible<X&>);

Returning a reference satisfies the std::move_constructible requirements.

2. Returns `vertices(g)`, if such function is ADL-discoverable and returns a `std::move_constructible` type.

Collaborator

Same as previous.

3. Returns `g`, if it is a `std::ranges::random_access_range`.


### `vertex_id`

The CPO `vertex_id(g, ui)` is used to obtain the _id_ of the vertex, given the iterator.
We also use its return type to determine the type of the vertex id: `vertex_id_t<G>`.

Collaborator

This is fine for now. With the addition of descriptors, this is the only function signature that changes. Once that's available, ui becomes u (descriptor).

#### Customization

1. Returns `ui->vertex_id(g)`, if this expression is valid and its type is `std::move_constructible`.

Collaborator

I'm not sure if this will remain valid when descriptors are used. It means that vertex_id(g) would need to be defined on the descriptor, which may be possible.

2. Returns `vertex_id(g, ui)`, if this expression is valid and its type is `std::move_constructible`.

Collaborator

If the return type is simple like an integral id then this is fine. If we want to extend the id type to be more than integral, like a user-defined type or a string, then we won't want to require move_constructible.

Contributor Author

Note that my goal is to document what is in the current library (modulo bugs), rather than a vision for the future.

After a bit of investigation, I conclude that the current Graph Container Interface requires the IDs to be copy_constructible!

Suppose that I have my own graph representation that needs to use non-copyable, non-movable IDs. How am I supposed to customize vertex_id?

ID const& vertex_id(Graph const& g, Graph::const_iterator it) { 
  return it->first;
}

Shall I return by value or by reference to const? If by value, then I need to copy. If by reference, then the CPO vertex_id will do the copying, because its return type is non-reference, and I am returning a reference to const, so even move will not work.

Contributor Author

Filed #174.

3. Returns <code>static_cast&lt;<em>vertex-id-t</em>&lt;G&gt;&gt;(ui - begin(vertices(g)))</code>,

Collaborator

vertex_id_(g,u) is used to define vertex_id_t, so there may be a circular definition here (I hope that's not because of the current definition in graph-v2).

I think this needs to be fleshed out a little more to define the vertex type based on vertex_t and whether it is a range or not (for #1 & #2 below).

if `std::ranges::random_access_range<vertex_range_t<G>>` is `true`, where <code><em>vertex-id-t</em></code> is defined as:

* `I`, when the type of `G` matches pattern `ranges::forward_range<ranges::forward_range<I>>` and `I` is `std::integral`,
* `I0`, when the type of `G` matches pattern <code>ranges::forward_range&lt;ranges::forward_range&lt;<em>tuple-like</em>&lt;I0, ...&gt;&gt;&gt;</code> and `I0` is `std::integral`,
* `std::size_t` otherwise.
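
The third way amounts to taking the distance of the iterator from the beginning of the vertex range. A minimal sketch, assuming a standard container as the vertex range; `default_vertex_id` is a hypothetical helper and not the library's CPO:

```c++
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical equivalent of the fallback: the id is the iterator's offset
// from the beginning of a random-access vertex range, cast to the id type.
template <class Id = std::size_t, class VertexRange>
Id default_vertex_id(const VertexRange& vertices, typename VertexRange::const_iterator ui) {
  return static_cast<Id>(ui - std::begin(vertices));
}

// For a vector-of-vectors graph, the vertex at offset 2 gets id 2.
inline void default_vertex_id_demo() {
  const std::vector<std::vector<int>> g{{1, 2}, {2}, {}};
  assert(default_vertex_id<int>(g, g.begin() + 2) == 2);
  assert(default_vertex_id(g, g.begin()) == 0u);
}
```

Here the `int` id type is chosen by hand for the call; in the rules above it would follow from the element type of the inner range.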


### `find_vertex`

TODO `find_vertex(g, uid)`

### `edges(g, u)`

### `edges(g, uid)`

### `num_edges(g)`

### `target_id(g, uv)`

### `target_id(e)`

### `source_id(g, uv)`

### `source_id(e)`

### `target(g, uv)`

### `source(g, uv)`

### `find_vertex_edge(g, u, vid)`

### `find_vertex_edge(g, uid, vid)`

### `contains_edge(g, uid, vid)`

### `partition_id(g, u)`

### `partition_id(g, uid)`

### `num_vertices(g, pid)`

### `num_vertices(g)`

### `degree(g, u)`

### `degree(g, uid)`

### `vertex_value(g, u)`

### `edge_value(g, uv)`

### `edge_value(e)`

### `graph_value(g)`

### `num_partitions(g)`

### `has_edge(g)`
