Describe the bug
The load_valid_labels() function in yolo/tools/data_loader.py currently filters coordinates element-wise, as shown below:
valid_points = points[(points >= 0) & (points <= 1)].reshape(-1, 2)
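Indexing with a 2D boolean mask returns a flat 1D array, so when only one coordinate of a row is filtered out, reshape(-1, 2) re-pairs the surviving value with a coordinate from a neighboring point. This disrupts the 2D structure of the coordinate pairs.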
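A related failure mode, sketched below with a made-up three-point input (not taken from the dataset): when an odd number of values survive the element-wise mask, the flat result cannot be regrouped into pairs at all and reshape raises a ValueError.

```python
import numpy as np

# Hypothetical polygon: only the x value of the second point (1.1) is out of range.
points = np.array([[0.1, 0.1], [1.1, 0.1], [0.1, 0.8]])

mask = (points >= 0) & (points <= 1)
flat = points[mask]  # a 2D boolean mask yields a flat 1D array of the kept values

try:
    flat.reshape(-1, 2)
    reshape_failed = False
except ValueError:
    # Five surviving values cannot be split into (x, y) pairs.
    reshape_failed = True

print(flat.shape, reshape_failed)
```

So depending on the input, the current expression either silently mixes coordinates from different points or fails outright.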
A more appropriate approach would be:
valid_points = points[np.all((points >= 0) & (points <= 1), axis=1)]
This ensures that each pair of coordinates (each row) is checked together, keeping only those points that satisfy the conditions (coord >= 0 and coord <= 1) on both axes.
To Reproduce
Consider the following points:
points = np.array([[0.1, 0.1], [1.1, 0.1], [1.1, 0.8], [0.1, 0.8]])
With the current logic:
>>> (points >= 0) & (points <= 1)
array([[ True,  True],
       [False,  True],
       [False,  True],
       [ True,  True]])
>>> points[(points >= 0) & (points <= 1)].reshape(-1, 2)
array([[0.1, 0.1],
       [0.1, 0.8],
       [0.1, 0.8]])
With the suggested fix, the output would be:
>>> np.all((points >= 0) & (points <= 1), axis=1)
array([ True, False, False,  True])
>>> points[np.all((points >= 0) & (points <= 1), axis=1)]
array([[0.1, 0.1],
       [0.1, 0.8]])
This fix keeps each coordinate pair intact, but if a bounding box has points outside [0, 1], those points are still filtered out. The result can be an incomplete bounding box built only from the remaining points, which no longer meaningfully represents the original box.
Expected behavior
I believe that unless all of a box's points lie outside [0, 1], we should preserve the bounding box by clipping any out-of-range coordinates to [0, 1]. This keeps as much of the original bounding box as possible. What are your thoughts?
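A minimal sketch of that proposal, for discussion only. The helper name clip_or_drop is hypothetical and not part of the repository, and reading "all points lie outside" as "no point has both coordinates in [0, 1]" is an assumption:

```python
import numpy as np


def clip_or_drop(points: np.ndarray) -> np.ndarray:
    """Hypothetical helper: clip points to [0, 1], or drop them all.

    Assumption: the shape is discarded only when no point has both
    coordinates inside [0, 1]; otherwise every point is kept and clipped.
    """
    points = np.asarray(points, dtype=float).reshape(-1, 2)
    # Row-wise check: True where both x and y are within [0, 1].
    inside = np.all((points >= 0) & (points <= 1), axis=1)
    if not inside.any():
        # Nothing lies within the image, so return an empty (0, 2) array.
        return np.empty((0, 2), dtype=points.dtype)
    # Keep the point count unchanged; only pull stray coordinates to the border.
    return np.clip(points, 0.0, 1.0)


points = np.array([[0.1, 0.1], [1.1, 0.1], [1.1, 0.8], [0.1, 0.8]])
clipped = clip_or_drop(points)
print(clipped)
```

With the points from the reproduction above, all four corners survive and the two x values of 1.1 become 1.0, so the box keeps its shape instead of collapsing to two points.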
Screenshots
Visualization of head bounding boxes from the CrowdHuman dataset.