Skip to content

Fix overlapping crossbars #87

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from
Draft

Conversation

pedroilidio
Copy link

Three main points are relevant in this PR:

  1. It fixes Crossbars can overlap in Critical Difference Diagrams #86.

  2. I also modified the sign_array function, since I was again running into #82. The modification dispenses the epsilon added by Update _plotting.py #81 and discussed in #82. Please let me know if this is incorrect. I made sure the copy problem pointed out by #77 remains solved.

  3. To ensure these problems do not come up in the future I added unit tests based on #86 and #82.

@pedroilidio
Copy link
Author

pedroilidio commented Apr 15, 2025

Some further details about the solution. The goal of the function is to plot in different levels the crossbars that intersect with each other. The problem was caused by how we were calculating the intersections between crossbars (merely saying that they intersect if they have a point in common):

if not any(bool(bar & bar_in_lvl) for bar_in_lvl in bars_in_level):

Now, the function considers that two crossbars intersect if their intervals between their minimum and maximum ranks intersect (the ranks are the positions ploted in the x axis). The line below evaluates if two crossbars do not intersect:

(bar_i["max"] < bar_j["min"]) or (bar_i["min"] > bar_j["max"])

Furthermore, I replaced the iterative algorithm to find the crossbar levels:

# Create stacking of crossbars: for each level, try to fit the crossbar,
# so that it does not intersect with any other in the level. If it does not
# fit in any level, create a new level for it.
crossbar_levels: list[list[set]] = []
for bar in crossbar_sets:
for level, bars_in_level in enumerate(crossbar_levels):
if not any(bool(bar & bar_in_lvl) for bar_in_lvl in bars_in_level):
ypos = -level - 1
bars_in_level.append(bar)
break
else:
ypos = -len(crossbar_levels) - 1
crossbar_levels.append([bar])

The function now generates a crossbar-crossbar adjacency matrix called on_same_level and uses the already implemented _find_maximal_cliques function to include as many bars as possible in each level:

# The levels are the maximal cliques of the crossbar adjacency matrix.
crossbar_levels = _find_maximal_cliques(on_same_level)

Given n_bars crossbars, on_same_level is a n_bars by n_bars binary matrix containing 0 if the crossbars overlap and 1 otherwise. 1 thus means the crossbars can be plotted in the same level.

Edits: Adjust lines displayed.

@maximtrp
Copy link
Owner

Thank you! I'm gonna review all the related code in your pull request and this package

@pedroilidio
Copy link
Author

Wait, I've found two other bugs in this implementation. I'll check if they are also in the master branch, open issues if they do, and add the fixes to this PR.

Bug 1: Crossbars are repeated
Bug 2: the points in the crossbar are not being plotted in order, which causes lines to go back and forth sometimes (apparent when linestyle is not contiguous and/or not fully opaque).

test

import numpy as np
import matplotlib.pyplot as plt
from pandas import DataFrame, Series

from scikit_posthocs import critical_difference_diagram

n = 10
rng = np.random.default_rng(0)
rank_values = rng.uniform(size=n)
index = np.arange(n).astype(str)

ranks = Series(rank_values, index=index)

sig_array = np.zeros((n, n), dtype=bool)
values = rng.integers(0, 2, size=(n * (n - 1)) // 2, dtype=bool)

triu = np.triu_indices(n, k=1)
sig_array[triu] = sig_array.T[triu] = values

sig_matrix = DataFrame(sig_array, index=index, columns=index)
output = critical_difference_diagram(
    ranks, sig_matrix, crossbar_props={"marker": "s", "linestyle": ":", "alpha": .5}
)
plt.savefig("test.png")

@maximtrp
Copy link
Owner

@pedroilidio Yeah, thank you! I was about to write that we need as many tests as possible. Can this functionality be covered by them more carefully?

@pedroilidio
Copy link
Author

Indeed, @maximtrp, I'll add as many tests for the CDDs as I can :)

@pedroilidio
Copy link
Author

I apologize, I was quite wrong here:

Furthermore, I replaced the iterative algorithm to find the crossbar levels:
[...]

The maximal cliques approach does not ensure different crossbars at each level. This is more complicated: it's a graph coloring problem in the complement graph of the crossbar graph (on_same_level) (see this SE answer).

I'll revert back to the greedy approach we were using. Another (maybe overkill) option: I could adapt the implementation of NetworkX's approach in a separate module, avoiding to add NetworkX as a new dependency. I could do the same for the maximal cliques function I needed to implement for the CDDs.

@pedroilidio
Copy link
Author

It looks better now, but it suspiciously changed from my previous figure (pairs of points are gone). I will write several tests in a further commit.

fixed

import numpy as np
import matplotlib.pyplot as plt
from pandas import DataFrame, Series

from scikit_posthocs import critical_difference_diagram

n = 10
rng = np.random.default_rng(0)
rank_values = rng.uniform(size=n)
index = np.arange(n).astype(str)
ranks = Series(rank_values, index=index)

triu = np.triu_indices(n, k=1)
values = rng.integers(0, 2, size=len(triu[0]), dtype=np.int8)
sig_array = np.ones((n, n), dtype=np.int8)
sig_array[triu] = sig_array.T[triu] = values

sig_matrix = DataFrame(sig_array, index=index, columns=index)

plt.figure(figsize=(5, 3))
output = critical_difference_diagram(
    ranks, sig_matrix, crossbar_props={"marker": "s", "linestyle": ":", "alpha": .3}
)
plt.tight_layout()
plt.savefig("fixed.png")

@maximtrp
Copy link
Owner

maximtrp commented Apr 18, 2025

Thank you once again! Now not all tests are passed 👇

@pedroilidio
Copy link
Author

Thanks for following this thread, @maximtrp! I'll be busier for some days, so I'll get back to working on this at the end of the next week. For now I'll convert the PR to a draft PR until it's ready for merging. Next steps are:

  • Solve failing tests.
  • Add way more unit tests, e.g.:
    • Assert ranks are ordered in each crossbar (this fixes [CDD] Crossbars are plotted out of order #89).
    • Assert crossbars do not ever overlap.
    • Assert maximal cliques function is properly working (maybe compare several random cases with NetworkX results)
    • etc.
  • Decide: for the greedy placement of crossbars in the levels, should we prioritize wider bars, left-most bars, or maybe a mixture of both? (I'm talking specifically about the following line of code)
    for bar_i in sorted(crossbar_ranks, key=lambda x: x[0] - x[-1]):

@pedroilidio pedroilidio marked this pull request as draft April 18, 2025 11:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crossbars can overlap in Critical Difference Diagrams
2 participants