@codeflash-ai codeflash-ai bot commented Oct 24, 2025

📄 56% (0.56x) speedup for GlobalMercator.GoogleTile in opendm/tiles/gdal2tiles.py

⏱️ Runtime : 564 microseconds → 362 microseconds (best of 200 runs)

📝 Explanation and details

The optimization introduces caching for power-of-2 calculations that eliminates redundant exponential operations in the frequently called GoogleTile method.

Key optimization applied:

  • Added _power_of_2_cache dictionary to store pre-calculated 2**zoom - 1 values
  • Cache lookup (zoom in self._power_of_2_cache) before expensive exponentiation
  • Only calculates 2**zoom - 1 once per unique zoom level, then reuses cached result
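
The diff itself is not reproduced in this comment, so the following is only a minimal sketch of what the list above describes. The attribute name `_power_of_2_cache` comes from the description; the class name `CachedMercatorSketch` and the exact method body are assumptions for illustration, not the actual patch.

```python
class CachedMercatorSketch:
    """Hypothetical stand-in for GlobalMercator; only GoogleTile is sketched."""

    def __init__(self):
        # Maps zoom -> 2**zoom - 1 (the maximum tile index at that zoom).
        self._power_of_2_cache = {}

    def GoogleTile(self, tx, ty, zoom):
        """Convert TMS tile coordinates to Google tile coordinates."""
        if zoom in self._power_of_2_cache:
            max_index = self._power_of_2_cache[zoom]  # cache hit
        else:
            max_index = 2**zoom - 1  # cache miss: compute once
            self._power_of_2_cache[zoom] = max_index
        # The origin moves from the bottom-left to the top-left corner,
        # so only the y index is flipped.
        return tx, max_index - ty
```

Because the cache lives on the instance in this sketch, each new object starts with an empty cache and pays the miss cost once per zoom level.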

Why this leads to speedup:

  • The original code computed 2**zoom - 1 on every single call (1626 hits taking 801,830ns total)
  • Exponentiation is computationally expensive, especially for larger zoom values
  • In the optimized version, cache hits (1584) vastly outnumber cache misses (42), so most calls skip the exponentiation entirely
  • Cache lookups are O(1) dictionary operations vs O(log n) exponentiation complexity
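
One way to check this reasoning locally is a small `timeit` comparison. The sketch below is not part of the PR; the helper names `direct`, `cached`, and `compare` are hypothetical, and the measured ratio will likely vary with the Python version and the machine, so no expected timings are given.

```python
import timeit

# Module-level cache standing in for the instance attribute in the PR.
cache = {}

def direct(ty, zoom):
    # Original approach: recompute 2**zoom - 1 on every call.
    return (2**zoom - 1) - ty

def cached(ty, zoom):
    # Cached approach: compute 2**zoom - 1 once per zoom level.
    if zoom in cache:
        max_index = cache[zoom]
    else:
        max_index = cache[zoom] = 2**zoom - 1
    return max_index - ty

def compare(zoom=20, number=100_000):
    """Return (seconds for direct, seconds for cached) over `number` calls."""
    t_direct = timeit.timeit(lambda: direct(5, zoom), number=number)
    t_cached = timeit.timeit(lambda: cached(5, zoom), number=number)
    return t_direct, t_cached

if __name__ == "__main__":
    t_direct, t_cached = compare()
    print(f"direct: {t_direct:.4f}s  cached: {t_cached:.4f}s")
```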

Performance characteristics from tests:

  • First calls to new zoom levels are slower (30-40% overhead) due to cache population
  • Subsequent calls to same zoom levels are significantly faster (50-70% speedup)
  • Large-scale batch operations show major gains (57-69% faster) when processing many tiles at the same zoom level
  • Optimization shines for repeated zoom usage, which is typical in tile generation workflows where entire zoom levels are processed sequentially

This caching strategy is particularly effective for tile map generation where applications typically process all tiles at a specific zoom level before moving to the next level.
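
The level-by-level access pattern can be illustrated with a small hit/miss counter. This is a sketch under assumptions: `count_cache_events` is a hypothetical helper, and the zoom range and tile count are made up for the example rather than taken from the profiling run.

```python
def count_cache_events(zoom_levels, tiles_per_level):
    """Count cache hits and misses when tiles are processed one zoom at a time."""
    cache = {}
    hits = misses = 0
    for zoom in zoom_levels:
        for ty in range(tiles_per_level):
            if zoom in cache:
                hits += 1
            else:
                cache[zoom] = 2**zoom - 1
                misses += 1
            _ = cache[zoom] - ty  # the flipped y index, unused here
    return hits, misses

hits, misses = count_cache_events(range(5, 8), 100)
print(hits, misses)  # 297 3
```

Each zoom level costs exactly one miss, so the miss count equals the number of distinct zoom levels regardless of how many tiles are processed per level.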

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 1664 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
import pytest  # used for our unit tests
from opendm.tiles.gdal2tiles import GlobalMercator

# unit tests

# BASIC TEST CASES
def test_basic_origin_tile_zoom_0():
    # At zoom level 0, only one tile exists: (0,0)
    gm = GlobalMercator()
    codeflash_output = gm.GoogleTile(0, 0, 0) # 553ns -> 844ns (34.5% slower)

def test_basic_corners_zoom_1():
    # At zoom level 1, tiles are (0,0), (0,1), (1,0), (1,1)
    gm = GlobalMercator()
    # TMS (0,0) -> Google (0,1)
    codeflash_output = gm.GoogleTile(0, 0, 1) # 702ns -> 1.00μs (29.9% slower)
    # TMS (0,1) -> Google (0,0)
    codeflash_output = gm.GoogleTile(0, 1, 1) # 469ns -> 462ns (1.52% faster)
    # TMS (1,0) -> Google (1,1)
    codeflash_output = gm.GoogleTile(1, 0, 1) # 345ns -> 256ns (34.8% faster)
    # TMS (1,1) -> Google (1,0)
    codeflash_output = gm.GoogleTile(1, 1, 1) # 339ns -> 224ns (51.3% faster)

def test_basic_middle_tile_zoom_2():
    # At zoom level 2, check a middle tile
    gm = GlobalMercator()
    # TMS (1,2) -> Google (1,1)
    codeflash_output = gm.GoogleTile(1, 2, 2) # 668ns -> 986ns (32.3% slower)

def test_basic_various_tiles_zoom_3():
    gm = GlobalMercator()
    # TMS (3,7) at zoom 3: Google (3,0)
    codeflash_output = gm.GoogleTile(3, 7, 3) # 664ns -> 969ns (31.5% slower)
    # TMS (2,5) at zoom 3: Google (2,2)
    codeflash_output = gm.GoogleTile(2, 5, 3) # 429ns -> 492ns (12.8% slower)

# EDGE TEST CASES
def test_edge_negative_tile_indices():
    # Negative tile indices are not standard, but should be handled consistently
    gm = GlobalMercator()
    # TMS (-1,0) at zoom 1: Google (-1,1)
    codeflash_output = gm.GoogleTile(-1, 0, 1) # 668ns -> 980ns (31.8% slower)
    # TMS (0,-1) at zoom 1: Google (0,2)
    codeflash_output = gm.GoogleTile(0, -1, 1) # 431ns -> 459ns (6.10% slower)

def test_edge_large_zoom():
    # Large zoom, check correctness of calculation
    gm = GlobalMercator()
    zoom = 20
    # At zoom 20, max tile index is 2**20-1 = 1048575
    # TMS (1048575, 0) -> Google (1048575, 1048575)
    codeflash_output = gm.GoogleTile(1048575, 0, 20) # 899ns -> 1.12μs (19.6% slower)
    # TMS (0, 1048575) -> Google (0, 0)
    codeflash_output = gm.GoogleTile(0, 1048575, 20) # 495ns -> 460ns (7.61% faster)
    # TMS (524288, 524288) -> Google (524288, 524287)
    codeflash_output = gm.GoogleTile(524288, 524288, 20) # 380ns -> 247ns (53.8% faster)

def test_edge_zero_zoom_various_tiles():
    # At zoom 0, only (0,0) is valid, but test other values
    gm = GlobalMercator()
    # TMS (1,0) at zoom 0: Google (1,0)
    codeflash_output = gm.GoogleTile(1, 0, 0) # 526ns -> 791ns (33.5% slower)
    # TMS (0,1) at zoom 0: Google (0,-1)
    codeflash_output = gm.GoogleTile(0, 1, 0) # 300ns -> 475ns (36.8% slower)

def test_edge_non_integer_zoom():
    # Non-integer zooms are not standard, but function should handle them as per Python's rules
    gm = GlobalMercator()
    # zoom=1.5, TMS (2,2): Google (2, 2**1.5 - 1 - 2)
    expected_y = (2**1.5 - 1) - 2
    codeflash_output = gm.GoogleTile(2, 2, 1.5); result = codeflash_output # 1.20μs -> 1.75μs (31.5% slower)

def test_edge_extreme_tile_indices():
    # Very large or very small tile indices
    gm = GlobalMercator()
    # TMS (999,999) at zoom 10
    codeflash_output = gm.GoogleTile(999, 999, 10) # 738ns -> 1.09μs (32.0% slower)
    # TMS (-999,-999) at zoom 10
    codeflash_output = gm.GoogleTile(-999, -999, 10) # 489ns -> 500ns (2.20% slower)

def test_edge_tile_index_overflow():
    # Tile index larger than max tile index for zoom
    gm = GlobalMercator()
    # At zoom 2, max tile index is 3, test (5,5)
    codeflash_output = gm.GoogleTile(5, 5, 2) # 637ns -> 961ns (33.7% slower)

def test_edge_float_tile_indices():
    # Float tile indices should be handled as per Python's rules
    gm = GlobalMercator()
    # TMS (1.5, 2.5) at zoom 2
    codeflash_output = gm.GoogleTile(1.5, 2.5, 2) # 783ns -> 1.07μs (26.6% slower)


def test_large_scale_sequential_tiles_zoom_8():
    # Test a range of tiles at zoom 8 (256x256 tiles)
    gm = GlobalMercator()
    zoom = 8
    max_idx = 2**zoom - 1
    # Check 100 tiles along the diagonal
    for i in range(100):
        tx = i
        ty = i
        expected = (tx, max_idx - ty)
        codeflash_output = gm.GoogleTile(tx, ty, zoom) # 34.3μs -> 21.7μs (57.7% faster)

def test_large_scale_all_tiles_zoom_5():
    # Test all 1024 tiles at zoom 5 (32x32 grid)
    gm = GlobalMercator()
    zoom = 5
    max_idx = 2**zoom - 1
    for tx in range(32):
        for ty in range(32):
            expected = (tx, max_idx - ty)
            codeflash_output = gm.GoogleTile(tx, ty, zoom)

def test_large_scale_random_tiles_zoom_10():
    # Test 100 random tiles at zoom 10
    import random
    gm = GlobalMercator()
    zoom = 10
    max_idx = 2**zoom - 1
    random.seed(42)  # deterministic
    for _ in range(100):
        tx = random.randint(0, max_idx)
        ty = random.randint(0, max_idx)
        expected = (tx, max_idx - ty)
        codeflash_output = gm.GoogleTile(tx, ty, zoom) # 36.6μs -> 22.7μs (61.5% faster)

def test_large_scale_extreme_zoom_and_indices():
    # Test tiles near both ends of the index range at zoom 10
    gm = GlobalMercator()
    zoom = 10  # 1024x1024 tiles
    max_idx = 2**zoom - 1
    # Test the first and last 10 indices on each axis
    for tx in range(10):
        for ty in range(10):
            expected = (tx, max_idx - ty)
            codeflash_output = gm.GoogleTile(tx, ty, zoom)
            expected = (max_idx-tx, max_idx-ty)
            codeflash_output = gm.GoogleTile(max_idx-tx, max_idx-ty, zoom)

def test_large_scale_tile_size_variation():
    # Test that tileSize does not affect GoogleTile output
    for ts in [128, 256, 512, 1024]:
        gm = GlobalMercator(tileSize=ts)
        codeflash_output = gm.GoogleTile(5, 5, 3) # 1.84μs -> 2.39μs (23.0% slower)
        codeflash_output = gm.GoogleTile(0, 0, 0)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from opendm.tiles.gdal2tiles import GlobalMercator

# unit tests

@pytest.fixture
def mercator():
    # Fixture to create a default GlobalMercator instance
    return GlobalMercator()

# ------------------------------
# Basic Test Cases
# ------------------------------

def test_google_tile_basic_zero_zero_zoom0(mercator):
    # At zoom level 0, there is only one tile: (0,0)
    codeflash_output = mercator.GoogleTile(0, 0, 0) # 537ns -> 872ns (38.4% slower)

def test_google_tile_basic_zoom1_tiles(mercator):
    # At zoom level 1, tiles are (0,0), (0,1), (1,0), (1,1)
    # TMS (0,0) -> Google (0,1)
    codeflash_output = mercator.GoogleTile(0, 0, 1) # 685ns -> 1.02μs (32.6% slower)
    # TMS (0,1) -> Google (0,0)
    codeflash_output = mercator.GoogleTile(0, 1, 1) # 416ns -> 425ns (2.12% slower)
    # TMS (1,0) -> Google (1,1)
    codeflash_output = mercator.GoogleTile(1, 0, 1) # 359ns -> 261ns (37.5% faster)
    # TMS (1,1) -> Google (1,0)
    codeflash_output = mercator.GoogleTile(1, 1, 1) # 342ns -> 219ns (56.2% faster)

def test_google_tile_basic_zoom2_corners(mercator):
    # At zoom level 2, there are 4x4 tiles
    # Lower-left corner TMS (0,0) -> Google (0,3)
    codeflash_output = mercator.GoogleTile(0, 0, 2) # 672ns -> 1.05μs (36.1% slower)
    # Upper-right corner TMS (3,3) -> Google (3,0)
    codeflash_output = mercator.GoogleTile(3, 3, 2) # 465ns -> 467ns (0.428% slower)
    # Center TMS (1,2) -> Google (1,1)
    codeflash_output = mercator.GoogleTile(1, 2, 2) # 354ns -> 212ns (67.0% faster)

def test_google_tile_basic_typical_middle_tile(mercator):
    # Test a typical middle tile at zoom level 3
    codeflash_output = mercator.GoogleTile(4, 2, 3) # 655ns -> 966ns (32.2% slower)

# ------------------------------
# Edge Test Cases
# ------------------------------

def test_google_tile_edge_max_ty(mercator):
    # At zoom level 4, max ty is 2^4 - 1 = 15
    # TMS (0,15) -> Google (0,0)
    codeflash_output = mercator.GoogleTile(0, 15, 4) # 681ns -> 988ns (31.1% slower)

def test_google_tile_edge_min_ty(mercator):
    # At zoom level 4, min ty is 0
    # TMS (0,0) -> Google (0,15)
    codeflash_output = mercator.GoogleTile(0, 0, 4) # 699ns -> 954ns (26.7% slower)

def test_google_tile_edge_max_tx(mercator):
    # At zoom level 4, max tx is 15
    # TMS (15,0) -> Google (15,15)
    codeflash_output = mercator.GoogleTile(15, 0, 4) # 691ns -> 1.03μs (33.0% slower)
    # TMS (15,15) -> Google (15,0)
    codeflash_output = mercator.GoogleTile(15, 15, 4) # 442ns -> 462ns (4.33% slower)

def test_google_tile_edge_negative_tx_ty(mercator):
    # Negative tile indices: should not crash, but will return negative tx, possibly negative ty
    # This is not a valid tile, but we want to ensure the function is robust
    codeflash_output = mercator.GoogleTile(-1, -1, 2) # 684ns -> 1.03μs (33.6% slower)

def test_google_tile_edge_ty_beyond_range(mercator):
    # ty beyond valid range: should still compute, but not a valid tile
    # For zoom=2, ty=5, valid ty is 0..3, but let's test ty=5
    codeflash_output = mercator.GoogleTile(1, 5, 2) # 680ns -> 977ns (30.4% slower)

def test_google_tile_edge_large_zoom(mercator):
    # Large zoom level, check for correct calculation
    zoom = 10
    tx, ty = 1023, 1023  # max tile indices at zoom 10
    codeflash_output = mercator.GoogleTile(tx, ty, zoom) # 810ns -> 1.12μs (27.9% slower)
    # Lower left corner
    codeflash_output = mercator.GoogleTile(0, 0, zoom) # 498ns -> 456ns (9.21% faster)

def test_google_tile_edge_zero_zoom_nonzero_tx_ty(mercator):
    # At zoom=0, only tx=0, ty=0 is valid, but test others
    codeflash_output = mercator.GoogleTile(1, 1, 0) # 538ns -> 848ns (36.6% slower)

# ------------------------------
# Large Scale Test Cases
# ------------------------------

def test_google_tile_large_scale_all_tiles_zoom8(mercator):
    # At zoom=8, there are 256x256 tiles
    zoom = 8
    max_index = 2**zoom - 1  # 255
    # Test a handful of representative tiles
    # Lower-left corner
    codeflash_output = mercator.GoogleTile(0, 0, zoom) # 625ns -> 900ns (30.6% slower)
    # Upper-right corner
    codeflash_output = mercator.GoogleTile(max_index, max_index, zoom) # 451ns -> 483ns (6.63% slower)
    # Center tile
    mid = max_index // 2
    codeflash_output = mercator.GoogleTile(mid, mid, zoom) # 347ns -> 221ns (57.0% faster)
    # Random tile
    codeflash_output = mercator.GoogleTile(123, 77, zoom) # 371ns -> 224ns (65.6% faster)

def test_google_tile_large_scale_batch(mercator):
    # Test a batch of tiles at zoom=9 (512x512 tiles)
    zoom = 9
    max_index = 2**zoom - 1  # 511
    # Test first 10 tiles along diagonal
    for i in range(10):
        tx, ty = i, i
        codeflash_output = mercator.GoogleTile(tx, ty, zoom); result = codeflash_output # 3.92μs -> 3.07μs (27.9% faster)
        expected = (tx, max_index - ty)

def test_google_tile_large_scale_last_row(mercator):
    # Test all tiles in last row at zoom=7 (128x128 tiles)
    zoom = 7
    max_index = 2**zoom - 1  # 127
    ty = max_index
    for tx in range(0, 128):
        codeflash_output = mercator.GoogleTile(tx, ty, zoom); result = codeflash_output # 45.2μs -> 26.7μs (69.0% faster)
        expected = (tx, 0)

def test_google_tile_large_scale_random_tiles(mercator):
    # Test 10 hand-picked tiles at zoom=6 (64x64 tiles)
    zoom = 6
    max_index = 2**zoom - 1  # 63
    test_cases = [
        (0, 0), (63, 63), (32, 32), (10, 50), (50, 10),
        (25, 37), (37, 25), (1, 62), (62, 1), (31, 31)
    ]
    for tx, ty in test_cases:
        codeflash_output = mercator.GoogleTile(tx, ty, zoom); result = codeflash_output # 3.71μs -> 3.00μs (23.5% faster)
        expected = (tx, max_index - ty)

# ------------------------------
# Additional Robustness/Mutation Tests
# ------------------------------

def test_google_tile_mutation_ty_off_by_one(mercator):
    # If the formula is off by one, this test will fail
    zoom = 5
    tx, ty = 10, 20
    expected = (tx, (2**zoom - 1) - ty)
    codeflash_output = mercator.GoogleTile(tx, ty, zoom); result = codeflash_output # 541ns -> 866ns (37.5% slower)

def test_google_tile_mutation_wrong_formula(mercator):
    # If the formula uses (2**zoom) instead of (2**zoom - 1), this test will fail
    zoom = 3
    tx, ty = 5, 7
    expected = (tx, (2**zoom - 1) - ty)
    codeflash_output = mercator.GoogleTile(tx, ty, zoom); result = codeflash_output # 545ns -> 838ns (35.0% slower)

def test_google_tile_mutation_swap_tx_ty(mercator):
    # If tx and ty are swapped, this test will fail
    zoom = 4
    tx, ty = 7, 9
    expected = (tx, (2**zoom - 1) - ty)
    codeflash_output = mercator.GoogleTile(tx, ty, zoom); result = codeflash_output # 540ns -> 844ns (36.0% slower)

def test_google_tile_mutation_wrong_sign(mercator):
    # If the formula is (2**zoom - 1) + ty, this test will fail
    zoom = 2
    tx, ty = 2, 1
    expected = (tx, (2**zoom - 1) - ty)
    codeflash_output = mercator.GoogleTile(tx, ty, zoom); result = codeflash_output # 551ns -> 870ns (36.7% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run git checkout codeflash/optimize-GlobalMercator.GoogleTile-mh4z2zhd and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 24, 2025 14:55
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 24, 2025