Incorrect point filtering logic in load_valid_labels() #156

Open
@ry-immr

Describe the bug

The load_valid_labels() function in yolo/tools/data_loader.py currently filters coordinates element-wise, as shown below:

valid_points = points[(points >= 0) & (points <= 1)].reshape(-1, 2)

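Boolean indexing with an element-wise mask flattens the array and keeps individual values, not rows. When only one coordinate of a pair is out of range, the surviving values are re-paired by reshape(-1, 2), which breaks the 2D structure of the coordinate pairs and can combine an x from one point with a y from another.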

A more appropriate approach would be:

valid_points = points[np.all((points >= 0) & (points <= 1), axis=1)]

This checks each coordinate pair (row) as a whole and keeps only the points whose x and y both satisfy 0 <= coord <= 1.

To Reproduce

Consider the following points:

points = np.array([[0.1, 0.1], [1.1, 0.1], [1.1, 0.8], [0.1, 0.8]])

With the current logic:

>>> (points >= 0) & (points <= 1)
array([[ True,  True],
       [False,  True],
       [False,  True],
       [ True,  True]])
>>> points[(points >= 0) & (points <= 1)].reshape(-1, 2)
array([[0.1, 0.1],
       [0.1, 0.8],
       [0.1, 0.8]])
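
A related failure mode, sketched here with a made-up three-point input (not taken from the dataset): if an odd number of coordinates survives the element-wise mask, the reshape cannot pair them at all and raises instead of silently mis-pairing.

```python
import numpy as np

# Hypothetical input: only the x of the second point is out of range.
points = np.array([[0.1, 0.1], [1.1, 0.1], [0.1, 0.8]])

mask = (points >= 0) & (points <= 1)
flat = points[mask]  # boolean indexing with a 2D mask returns a 1D array

try:
    flat.reshape(-1, 2)  # 5 values cannot be split into (x, y) pairs
    error = None
except ValueError as exc:
    error = exc

print(flat.size, type(error).__name__)
```

So, depending on the input, the current logic either produces wrong pairs or fails outright.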

With the suggested fix, the output would be:

>>> np.all((points >= 0) & (points <= 1), axis=1)
array([ True, False, False,  True])
>>> points[np.all((points >= 0) & (points <= 1), axis=1)]
array([[0.1, 0.1],
       [0.1, 0.8]])

This fix preserves each coordinate pair, but any point of a bounding box that lies outside [0, 1] is still dropped. The box is then built only from the remaining points and no longer meaningfully represents the original bounding box.

Expected behavior

I believe that unless all points lie outside [0, 1], we should preserve the bounding box by clipping any out-of-range coordinates to [0, 1]. This keeps as much of the original bounding box as possible. What are your thoughts?
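
A minimal sketch of the clipping idea, assuming "all points lie outside" means that no (x, y) pair falls fully inside [0, 1]. The helper name clip_or_drop is hypothetical and is not part of yolo/tools/data_loader.py.

```python
import numpy as np


def clip_or_drop(points):
    """Hypothetical helper: clip a box's points to [0, 1], or drop the box.

    Returns None when no point lies fully inside [0, 1] (an assumed
    reading of the proposal); otherwise returns all points clipped.
    """
    points = np.asarray(points, dtype=float).reshape(-1, 2)
    # Row-wise check: a point is inside only if both x and y are in range.
    inside = np.all((points >= 0) & (points <= 1), axis=1)
    if not inside.any():
        return None
    # Keep every point; pull out-of-range coordinates back to the border.
    return np.clip(points, 0.0, 1.0)


points = np.array([[0.1, 0.1], [1.1, 0.1], [1.1, 0.8], [0.1, 0.8]])
clipped = clip_or_drop(points)
print(clipped)
```

With the points from the reproduction above, all four corners are kept and the two x values of 1.1 become 1.0. One caveat of this drop rule: a box whose corners are all outside the image but which still overlaps it would be discarded, so the exact condition may need discussion.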

Screenshots

Visualization of head bounding boxes from the CrowdHuman dataset.

Current filtering logic:
Image

Fixed filtering logic:
Image

Labels: bug (Something isn't working)
