
@codeflash-ai codeflash-ai bot commented Oct 25, 2025

📄 9% (0.09x) speedup for BoxBounds.divide_by_point in opendm/dem/ground_rectification/bounds/types.py

⏱️ Runtime: 128 microseconds → 118 microseconds (best of 517 runs)

📝 Explanation and details

The optimized code applies two changes that together yield a speedup of roughly 8-9% (128 μs down to 118 μs):

1. `__slots__` Memory Optimization:
Adding `__slots__ = ('_corners',)` to the `BoxBounds` class eliminates the per-instance `__dict__` that Python normally creates. This reduces memory overhead per instance and makes attribute access slightly faster, because the attribute lives in a fixed slot instead of being looked up in a dictionary.
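
As a rough illustration of the effect described above, the sketch below compares a regular class with a slotted one. `PlainBox` and `SlottedBox` are hypothetical stand-ins, not the real `BoxBounds`; storing the four bounds as a tuple in `_corners` is an assumption based only on the slot name.

```python
class PlainBox:
    """Hypothetical stand-in: a regular class, so each instance carries a __dict__."""

    def __init__(self, x_min, x_max, y_min, y_max):
        # Assumed layout: the four bounds kept together in one tuple.
        self._corners = (x_min, x_max, y_min, y_max)


class SlottedBox:
    """Hypothetical stand-in: same data, stored in a fixed slot instead of a __dict__."""

    __slots__ = ('_corners',)

    def __init__(self, x_min, x_max, y_min, y_max):
        self._corners = (x_min, x_max, y_min, y_max)


plain = PlainBox(0, 10, 0, 10)
slotted = SlottedBox(0, 10, 0, 10)

# A regular instance has a per-instance attribute dictionary; a slotted one does not.
print(hasattr(plain, '__dict__'))    # True
print(hasattr(slotted, '__dict__'))  # False

# Trade-off: attributes not listed in __slots__ cannot be added later.
try:
    slotted.label = 'tile'
except AttributeError:
    print('cannot add new attributes to a slotted instance')
```

The trade-off shown at the end is worth checking before adopting `__slots__`: any code that attaches extra attributes to instances at runtime would start raising `AttributeError`.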

2. Computation Hoisting:
The original code computes `x_point + EPSILON` and `y_point + EPSILON` inline in the `BoxBounds` constructor calls. As the expected sub-boxes in the generated tests show, each sum appears in two of the four calls. The optimized version pre-computes these values once:

  • x_point_eps = x_point + EPSILON
  • y_point_eps = y_point + EPSILON

This eliminates redundant arithmetic operations - instead of performing 4 additions per method call, it now performs only 2.
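
The hoisting step can be sketched as follows. This is a minimal illustration, not the actual diff: `divide_inline` and `divide_hoisted` are hypothetical free functions that return plain tuples in place of `BoxBounds` objects, and the sub-box layout is taken from the expected values in the generated tests.

```python
EPSILON = 0.00001  # same value the generated tests use


def divide_inline(corners, point):
    """Recompute the shifted coordinates in every tuple that needs them."""
    x_min, x_max, y_min, y_max = corners
    x_point, y_point = point
    return [
        (x_min, x_point, y_min, y_point),
        (x_point + EPSILON, x_max, y_min, y_point),
        (x_min, x_point, y_point + EPSILON, y_max),
        (x_point + EPSILON, x_max, y_point + EPSILON, y_max),
    ]


def divide_hoisted(corners, point):
    """Compute each shifted coordinate once and reuse it."""
    x_min, x_max, y_min, y_max = corners
    x_point, y_point = point
    x_point_eps = x_point + EPSILON
    y_point_eps = y_point + EPSILON
    return [
        (x_min, x_point, y_min, y_point),
        (x_point_eps, x_max, y_min, y_point),
        (x_min, x_point, y_point_eps, y_max),
        (x_point_eps, x_max, y_point_eps, y_max),
    ]


# Both versions evaluate the same expressions, so the results are identical.
box = (0, 10, 0, 10)
print(divide_inline(box, (5, 5)) == divide_hoisted(box, (5, 5)))  # True
```

Because the hoisted version evaluates exactly the same floating-point expressions, only fewer times, the outputs match bit for bit rather than merely within a tolerance.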

Performance Analysis:
The line profiler shows the optimization is most effective where the method is called frequently. Per-test results range from about 5% to 17% faster, with the best gains (15-17%) occurring in edge cases like corner points and zero-area boxes, where the method's fixed overhead is more pronounced relative to the simple operations being performed.

The __slots__ optimization provides consistent memory efficiency benefits, while computation hoisting reduces CPU cycles, making this particularly effective for applications that perform many box subdivisions in geometric algorithms.
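
A "best of N runs" figure like the one reported above can be approximated locally with `timeit.repeat`. The sketch below times only instance creation for two hypothetical stand-in classes (`PlainBox`, `SlottedBox`); it is not the Codeflash harness, and the `best_of` helper, repeat count, and loop count are arbitrary assumptions.

```python
import timeit


class PlainBox:
    """Hypothetical stand-in with a per-instance __dict__."""

    def __init__(self, x_min, x_max, y_min, y_max):
        self._corners = (x_min, x_max, y_min, y_max)


class SlottedBox:
    """Hypothetical stand-in using __slots__."""

    __slots__ = ('_corners',)

    def __init__(self, x_min, x_max, y_min, y_max):
        self._corners = (x_min, x_max, y_min, y_max)


def best_of(func, repeat=5, number=50_000):
    """Return the fastest per-call time in microseconds across `repeat` runs."""
    runs = timeit.repeat(func, repeat=repeat, number=number)
    return min(runs) / number * 1e6


plain_us = best_of(lambda: PlainBox(0, 10, 0, 10))
slotted_us = best_of(lambda: SlottedBox(0, 10, 0, 10))

# Absolute numbers depend on the machine and Python version.
print(f"plain:   {plain_us:.3f} us per instance")
print(f"slotted: {slotted_us:.3f} us per instance")
```

Taking the minimum rather than the mean reduces the influence of scheduler noise, which matters when each call takes only about a microsecond.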

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 1320 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import pytest
from opendm.dem.ground_rectification.bounds.types import BoxBounds

# Offset applied to the split coordinate when building the expected sub-boxes
EPSILON = 0.00001

# unit tests

# ----------- BASIC TEST CASES -----------

def test_basic_center_division():
    # Divide a 0-10,0-10 box at (5,5)
    box = BoxBounds(0, 10, 0, 10)
    codeflash_output = box.divide_by_point([5,5]); result = codeflash_output # 1.33μs -> 1.19μs (11.7% faster)
    # Check the four sub-boxes
    expected = [
        BoxBounds(0, 5, 0, 5),
        BoxBounds(5 + EPSILON, 10, 0, 5),
        BoxBounds(0, 5, 5 + EPSILON, 10),
        BoxBounds(5 + EPSILON, 10, 5 + EPSILON, 10)
    ]

def test_basic_non_integer_point():
    # Divide at a non-integer point
    box = BoxBounds(-2, 2, -2, 2)
    codeflash_output = box.divide_by_point([0.5, -0.5]); result = codeflash_output # 1.39μs -> 1.24μs (11.4% faster)
    expected = [
        BoxBounds(-2, 0.5, -2, -0.5),
        BoxBounds(0.5 + EPSILON, 2, -2, -0.5),
        BoxBounds(-2, 0.5, -0.5 + EPSILON, 2),
        BoxBounds(0.5 + EPSILON, 2, -0.5 + EPSILON, 2)
    ]

def test_basic_point_at_origin():
    # Divide at (0,0) in a box from -1 to 1
    box = BoxBounds(-1, 1, -1, 1)
    codeflash_output = box.divide_by_point([0,0]); result = codeflash_output # 1.48μs -> 1.29μs (15.0% faster)
    expected = [
        BoxBounds(-1, 0, -1, 0),
        BoxBounds(0 + EPSILON, 1, -1, 0),
        BoxBounds(-1, 0, 0 + EPSILON, 1),
        BoxBounds(0 + EPSILON, 1, 0 + EPSILON, 1)
    ]

# ----------- EDGE TEST CASES -----------

def test_edge_point_on_left_edge():
    # Point on left edge (x_min)
    box = BoxBounds(0, 10, 0, 10)
    point = [0, 5]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.48μs -> 1.29μs (15.0% faster)
    # First and third boxes should have zero width
    expected = [
        BoxBounds(0, 0, 0, 5),
        BoxBounds(0 + EPSILON, 10, 0, 5),
        BoxBounds(0, 0, 5 + EPSILON, 10),
        BoxBounds(0 + EPSILON, 10, 5 + EPSILON, 10)
    ]

def test_edge_point_on_bottom_edge():
    # Point on bottom edge (y_min)
    box = BoxBounds(1, 4, 2, 8)
    point = [2, 2]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.39μs -> 1.29μs (7.34% faster)
    expected = [
        BoxBounds(1, 2, 2, 2),
        BoxBounds(2 + EPSILON, 4, 2, 2),
        BoxBounds(1, 2, 2 + EPSILON, 8),
        BoxBounds(2 + EPSILON, 4, 2 + EPSILON, 8)
    ]

def test_edge_point_on_top_right_corner():
    # Point on top right corner (x_max, y_max)
    box = BoxBounds(-5, 5, -5, 5)
    point = [5, 5]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.38μs -> 1.28μs (7.83% faster)
    expected = [
        BoxBounds(-5, 5, -5, 5),
        BoxBounds(5 + EPSILON, 5, -5, 5),
        BoxBounds(-5, 5, 5 + EPSILON, 5),
        BoxBounds(5 + EPSILON, 5, 5 + EPSILON, 5)
    ]

def test_edge_point_on_min_corner():
    # Point at (x_min, y_min)
    box = BoxBounds(0, 100, 0, 100)
    point = [0, 0]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.46μs -> 1.26μs (16.2% faster)
    expected = [
        BoxBounds(0, 0, 0, 0),
        BoxBounds(0 + EPSILON, 100, 0, 0),
        BoxBounds(0, 0, 0 + EPSILON, 100),
        BoxBounds(0 + EPSILON, 100, 0 + EPSILON, 100)
    ]

def test_edge_point_extremely_close_to_edge():
    # Point extremely close to x_max and y_max
    box = BoxBounds(0, 1, 0, 1)
    point = [1 - 1e-9, 1 - 1e-9]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.37μs -> 1.26μs (8.73% faster)
    expected = [
        BoxBounds(0, 1 - 1e-9, 0, 1 - 1e-9),
        BoxBounds(1 - 1e-9 + EPSILON, 1, 0, 1 - 1e-9),
        BoxBounds(0, 1 - 1e-9, 1 - 1e-9 + EPSILON, 1),
        BoxBounds(1 - 1e-9 + EPSILON, 1, 1 - 1e-9 + EPSILON, 1)
    ]

def test_edge_zero_area_box():
    # Box with zero area (all corners same)
    box = BoxBounds(1, 1, 1, 1)
    point = [1, 1]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.42μs -> 1.21μs (17.0% faster)
    expected = [
        BoxBounds(1, 1, 1, 1),
        BoxBounds(1 + EPSILON, 1, 1, 1),
        BoxBounds(1, 1, 1 + EPSILON, 1),
        BoxBounds(1 + EPSILON, 1, 1 + EPSILON, 1)
    ]

def test_edge_point_not_inside_box():
    # Point outside box: Function assumes point is inside, but let's see behavior
    box = BoxBounds(0, 1, 0, 1)
    point = [2, 2]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.43μs -> 1.33μs (7.29% faster)
    # The function does not check for this, so the result is still four boxes
    expected = [
        BoxBounds(0, 2, 0, 2),
        BoxBounds(2 + EPSILON, 1, 0, 2),
        BoxBounds(0, 2, 2 + EPSILON, 1),
        BoxBounds(2 + EPSILON, 1, 2 + EPSILON, 1)
    ]

def test_edge_point_on_float_precision():
    # Point at a float that could cause precision errors
    box = BoxBounds(0.1, 0.3, 0.2, 0.4)
    point = [0.2, 0.3]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.41μs -> 1.26μs (12.1% faster)
    expected = [
        BoxBounds(0.1, 0.2, 0.2, 0.3),
        BoxBounds(0.2 + EPSILON, 0.3, 0.2, 0.3),
        BoxBounds(0.1, 0.2, 0.3 + EPSILON, 0.4),
        BoxBounds(0.2 + EPSILON, 0.3, 0.3 + EPSILON, 0.4)
    ]

# ----------- LARGE SCALE TEST CASES -----------

def test_large_scale_division_uniform_grid():
    # Divide a large box at a random point, check all sub-boxes are valid and non-overlapping
    box = BoxBounds(0, 1000, 0, 1000)
    point = [500, 500]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.44μs -> 1.25μs (15.4% faster)
    # Check all four boxes are BoxBounds and their corners are as expected
    expected = [
        BoxBounds(0, 500, 0, 500),
        BoxBounds(500 + EPSILON, 1000, 0, 500),
        BoxBounds(0, 500, 500 + EPSILON, 1000),
        BoxBounds(500 + EPSILON, 1000, 500 + EPSILON, 1000)
    ]

def test_large_scale_multiple_divisions():
    # Recursively divide a box several times, and ensure all sub-boxes are inside the original
    box = BoxBounds(0, 100, 0, 100)
    points = [
        [25, 25],
        [12.5, 12.5],
        [6.25, 6.25]
    ]
    # For each division, pick the first sub-box and divide again
    current_box = box
    for pt in points:
        codeflash_output = current_box.divide_by_point(pt); sub_boxes = codeflash_output # 3.40μs -> 2.96μs (14.7% faster)
        current_box = sub_boxes[0]  # Always pick the bottom-left sub-box


def test_large_scale_extreme_values():
    # Test with very large float values
    box = BoxBounds(-1e9, 1e9, -1e9, 1e9)
    point = [1e8, -1e8]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.72μs -> 1.62μs (6.49% faster)
    expected = [
        BoxBounds(-1e9, 1e8, -1e9, -1e8),
        BoxBounds(1e8 + EPSILON, 1e9, -1e9, -1e8),
        BoxBounds(-1e9, 1e8, -1e8 + EPSILON, 1e9),
        BoxBounds(1e8 + EPSILON, 1e9, -1e8 + EPSILON, 1e9)
    ]
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest
from opendm.dem.ground_rectification.bounds.types import BoxBounds

# Offset applied to the split coordinate when building the expected sub-boxes
EPSILON = 0.00001

# unit tests

# ---- Basic Test Cases ----

def test_basic_center_point_division():
    # Divide a 0-10 x/y box at center point (5,5)
    box = BoxBounds(0, 10, 0, 10)
    codeflash_output = box.divide_by_point([5, 5]); result = codeflash_output # 1.52μs -> 1.45μs (4.90% faster)
    # Each box should be correctly split
    expected = [
        BoxBounds(0, 5, 0, 5),
        BoxBounds(5 + EPSILON, 10, 0, 5),
        BoxBounds(0, 5, 5 + EPSILON, 10),
        BoxBounds(5 + EPSILON, 10, 5 + EPSILON, 10)
    ]
    for r, e in zip(result, expected):
        pass

def test_basic_non_center_point():
    # Divide at a non-center point
    box = BoxBounds(0, 10, 0, 10)
    codeflash_output = box.divide_by_point([2, 8]); result = codeflash_output # 1.45μs -> 1.31μs (10.5% faster)
    expected = [
        BoxBounds(0, 2, 0, 8),
        BoxBounds(2 + EPSILON, 10, 0, 8),
        BoxBounds(0, 2, 8 + EPSILON, 10),
        BoxBounds(2 + EPSILON, 10, 8 + EPSILON, 10)
    ]
    for r, e in zip(result, expected):
        pass

def test_basic_negative_bounds():
    # Negative coordinates
    box = BoxBounds(-10, 0, -10, 0)
    codeflash_output = box.divide_by_point([-5, -5]); result = codeflash_output # 1.44μs -> 1.27μs (12.6% faster)
    expected = [
        BoxBounds(-10, -5, -10, -5),
        BoxBounds(-5 + EPSILON, 0, -10, -5),
        BoxBounds(-10, -5, -5 + EPSILON, 0),
        BoxBounds(-5 + EPSILON, 0, -5 + EPSILON, 0)
    ]
    for r, e in zip(result, expected):
        pass

def test_basic_float_bounds():
    # Float coordinates
    box = BoxBounds(0.5, 1.5, 0.5, 1.5)
    codeflash_output = box.divide_by_point([1.0, 1.0]); result = codeflash_output # 1.36μs -> 1.23μs (10.9% faster)
    expected = [
        BoxBounds(0.5, 1.0, 0.5, 1.0),
        BoxBounds(1.0 + EPSILON, 1.5, 0.5, 1.0),
        BoxBounds(0.5, 1.0, 1.0 + EPSILON, 1.5),
        BoxBounds(1.0 + EPSILON, 1.5, 1.0 + EPSILON, 1.5)
    ]
    for r, e in zip(result, expected):
        pass

# ---- Edge Test Cases ----

def test_edge_point_at_min_corner():
    # Point at the lower-left corner
    box = BoxBounds(0, 10, 0, 10)
    codeflash_output = box.divide_by_point([0, 0]); result = codeflash_output # 1.48μs -> 1.33μs (11.7% faster)
    expected = [
        BoxBounds(0, 0, 0, 0),
        BoxBounds(0 + EPSILON, 10, 0, 0),
        BoxBounds(0, 0, 0 + EPSILON, 10),
        BoxBounds(0 + EPSILON, 10, 0 + EPSILON, 10)
    ]
    for r, e in zip(result, expected):
        pass

def test_edge_point_at_max_corner():
    # Point at the upper-right corner
    box = BoxBounds(0, 10, 0, 10)
    codeflash_output = box.divide_by_point([10, 10]); result = codeflash_output # 1.43μs -> 1.26μs (13.6% faster)
    expected = [
        BoxBounds(0, 10, 0, 10),
        BoxBounds(10 + EPSILON, 10, 0, 10),
        BoxBounds(0, 10, 10 + EPSILON, 10),
        BoxBounds(10 + EPSILON, 10, 10 + EPSILON, 10)
    ]
    for r, e in zip(result, expected):
        pass

def test_edge_point_on_x_min():
    # Point on x_min, but not y_min/y_max
    box = BoxBounds(0, 10, 0, 10)
    codeflash_output = box.divide_by_point([0, 5]); result = codeflash_output # 1.54μs -> 1.34μs (15.0% faster)
    expected = [
        BoxBounds(0, 0, 0, 5),
        BoxBounds(0 + EPSILON, 10, 0, 5),
        BoxBounds(0, 0, 5 + EPSILON, 10),
        BoxBounds(0 + EPSILON, 10, 5 + EPSILON, 10)
    ]
    for r, e in zip(result, expected):
        pass

def test_edge_point_on_y_min():
    # Point on y_min, but not x_min/x_max
    box = BoxBounds(0, 10, 0, 10)
    codeflash_output = box.divide_by_point([5, 0]); result = codeflash_output # 1.49μs -> 1.30μs (14.3% faster)
    expected = [
        BoxBounds(0, 5, 0, 0),
        BoxBounds(5 + EPSILON, 10, 0, 0),
        BoxBounds(0, 5, 0 + EPSILON, 10),
        BoxBounds(5 + EPSILON, 10, 0 + EPSILON, 10)
    ]
    for r, e in zip(result, expected):
        pass

def test_edge_point_on_x_max():
    # Point on x_max, but not y_min/y_max
    box = BoxBounds(0, 10, 0, 10)
    codeflash_output = box.divide_by_point([10, 5]); result = codeflash_output # 1.40μs -> 1.25μs (12.0% faster)
    expected = [
        BoxBounds(0, 10, 0, 5),
        BoxBounds(10 + EPSILON, 10, 0, 5),
        BoxBounds(0, 10, 5 + EPSILON, 10),
        BoxBounds(10 + EPSILON, 10, 5 + EPSILON, 10)
    ]
    for r, e in zip(result, expected):
        pass

def test_edge_point_on_y_max():
    # Point on y_max, but not x_min/x_max
    box = BoxBounds(0, 10, 0, 10)
    codeflash_output = box.divide_by_point([5, 10]); result = codeflash_output # 1.41μs -> 1.21μs (16.7% faster)
    expected = [
        BoxBounds(0, 5, 0, 10),
        BoxBounds(5 + EPSILON, 10, 0, 10),
        BoxBounds(0, 5, 10 + EPSILON, 10),
        BoxBounds(5 + EPSILON, 10, 10 + EPSILON, 10)
    ]
    for r, e in zip(result, expected):
        pass

def test_edge_point_on_box_edge():
    # Point on box edge but not at corner
    box = BoxBounds(0, 10, 0, 10)
    codeflash_output = box.divide_by_point([10, 0]); result = codeflash_output # 1.46μs -> 1.27μs (15.5% faster)
    expected = [
        BoxBounds(0, 10, 0, 0),
        BoxBounds(10 + EPSILON, 10, 0, 0),
        BoxBounds(0, 10, 0 + EPSILON, 10),
        BoxBounds(10 + EPSILON, 10, 0 + EPSILON, 10)
    ]
    for r, e in zip(result, expected):
        pass

def test_edge_point_precision():
    # Point very close to the edge, test floating point precision
    box = BoxBounds(0, 1, 0, 1)
    near_edge = EPSILON / 2
    codeflash_output = box.divide_by_point([near_edge, near_edge]); result = codeflash_output # 1.40μs -> 1.23μs (13.6% faster)
    expected = [
        BoxBounds(0, near_edge, 0, near_edge),
        BoxBounds(near_edge + EPSILON, 1, 0, near_edge),
        BoxBounds(0, near_edge, near_edge + EPSILON, 1),
        BoxBounds(near_edge + EPSILON, 1, near_edge + EPSILON, 1)
    ]
    for r, e in zip(result, expected):
        pass

def test_edge_zero_area_box():
    # Zero area box: x_min == x_max, y_min == y_max
    box = BoxBounds(5, 5, 5, 5)
    codeflash_output = box.divide_by_point([5, 5]); result = codeflash_output # 1.46μs -> 1.28μs (14.4% faster)
    expected = [
        BoxBounds(5, 5, 5, 5),
        BoxBounds(5 + EPSILON, 5, 5, 5),
        BoxBounds(5, 5, 5 + EPSILON, 5),
        BoxBounds(5 + EPSILON, 5, 5 + EPSILON, 5)
    ]
    for r, e in zip(result, expected):
        pass

# ---- Large Scale Test Cases ----

def test_large_scale_division_uniform():
    # Divide a large box at many points, check for correctness
    box = BoxBounds(0, 999, 0, 999)
    step = 100
    for x in range(0, 999, step):
        for y in range(0, 999, step):
            codeflash_output = box.divide_by_point([x, y]); result = codeflash_output
            expected = [
                BoxBounds(0, x, 0, y),
                BoxBounds(x + EPSILON, 999, 0, y),
                BoxBounds(0, x, y + EPSILON, 999),
                BoxBounds(x + EPSILON, 999, y + EPSILON, 999)
            ]
            for r, e in zip(result, expected):
                pass

def test_large_scale_random_points():
    # Divide at random points inside a large box
    import random
    box = BoxBounds(0, 999, 0, 999)
    random.seed(42)
    for _ in range(10):  # 10 random points
        x = random.randint(0, 998)
        y = random.randint(0, 998)
        codeflash_output = box.divide_by_point([x, y]); result = codeflash_output # 8.43μs -> 7.78μs (8.38% faster)
        expected = [
            BoxBounds(0, x, 0, y),
            BoxBounds(x + EPSILON, 999, 0, y),
            BoxBounds(0, x, y + EPSILON, 999),
            BoxBounds(x + EPSILON, 999, y + EPSILON, 999)
        ]
        for r, e in zip(result, expected):
            pass

def test_large_scale_float_precision():
    # Divide a box with very large float values
    box = BoxBounds(1e6, 1e6 + 999, 2e6, 2e6 + 999)
    point = [1e6 + 500.5, 2e6 + 250.25]
    codeflash_output = box.divide_by_point(point); result = codeflash_output # 1.21μs -> 1.12μs (7.38% faster)
    expected = [
        BoxBounds(1e6, 1e6 + 500.5, 2e6, 2e6 + 250.25),
        BoxBounds(1e6 + 500.5 + EPSILON, 1e6 + 999, 2e6, 2e6 + 250.25),
        BoxBounds(1e6, 1e6 + 500.5, 2e6 + 250.25 + EPSILON, 2e6 + 999),
        BoxBounds(1e6 + 500.5 + EPSILON, 1e6 + 999, 2e6 + 250.25 + EPSILON, 2e6 + 999)
    ]
    for r, e in zip(result, expected):
        pass

def test_large_scale_box_size_one():
    # Box of size one, divide at all possible integer points
    box = BoxBounds(0, 1, 0, 1)
    for x in range(0, 2):
        for y in range(0, 2):
            codeflash_output = box.divide_by_point([x, y]); result = codeflash_output
            expected = [
                BoxBounds(0, x, 0, y),
                BoxBounds(x + EPSILON, 1, 0, y),
                BoxBounds(0, x, y + EPSILON, 1),
                BoxBounds(x + EPSILON, 1, y + EPSILON, 1)
            ]
            for r, e in zip(result, expected):
                pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-BoxBounds.divide_by_point-mh5pgkzo` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 25, 2025 03:13
@codeflash-ai codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: Medium" (Optimization Quality according to Codeflash) labels Oct 25, 2025