[website] OpenCL convolution: performance and robustness cleanups #123

@paxcalpt

Found while auditing the website branch (OpenCL convolution; also in PR #106). The three critical correctness bugs are fixed in #122; this issue tracks the remaining non-critical items in __opencl__.py / image_convolution.py.

MEDIUM: per-call context + kernel recompile. On every call, convolve1D_opencl creates a fresh cl.Context and cl.CommandQueue, re-reads the .cl file, and re-runs cl.Program(...).build(). That is 3 context creations + 3 JIT compiles per 3D convolution, repeated for every frame in generate_frames_volume_convolution. Buffers are never explicitly released. This negates the performance benefit the GPU path is meant to provide. Cache the context/queue/compiled program (module- or device-scoped) and reuse them.
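A minimal sketch of the caching idea, not the project's actual code. The names get_cl_state and build_cl_state are hypothetical, and cl.create_some_context is only a stand-in for however the module really picks its device (e.g. via _fastest_device). The expensive setup runs once per kernel file and later calls reuse the result:

```python
import functools


def build_cl_state(kernel_path):
    """Create the context, queue and compiled program (the expensive part)."""
    # Imported lazily so this module still loads on machines without pyopencl.
    import pyopencl as cl

    # Assumption: the real code would build the context from its chosen device.
    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(ctx)
    with open(kernel_path) as f:
        program = cl.Program(ctx, f.read()).build()  # JIT compile happens here
    return ctx, queue, program


@functools.lru_cache(maxsize=None)
def get_cl_state(kernel_path, builder=build_cl_state):
    """Return (ctx, queue, program), building it only on the first call per key."""
    return builder(kernel_path)
```

With something like this, convolve1D_opencl would call get_cl_state(path) instead of constructing everything itself, so the three 1D passes of a 3D convolution share one context and one compiled program. Buffers created per call can still be freed explicitly with their release() method once the result has been copied back.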

LOW: dead fp64 check. In __opencl__.py the cl64/fp64-extension conditional sets cl_dp = False in both branches, so double precision is never enabled and _get_cl_code always rewrites double to float. Harmless (the kernel only uses float) but misleading dead code.
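If the check is kept rather than deleted, a live version could look roughly like the sketch below. The helper names device_supports_fp64 and prepare_cl_source are hypothetical (the real _get_cl_code signature is not shown in this issue). The sketch relies only on the device exposing its extensions as a space-separated string, as pyopencl's Device.extensions does:

```python
import re

CL_FP64_EXTENSION = "cl_khr_fp64"


def device_supports_fp64(device):
    """True if the device advertises the double-precision extension."""
    return CL_FP64_EXTENSION in device.extensions.split()


def prepare_cl_source(source, use_fp64):
    """Leave the kernel source as-is for fp64, else rewrite double to float."""
    if use_fp64:
        return source
    # Word boundaries keep identifiers such as "doubled" untouched.
    return re.sub(r"\bdouble\b", "float", source)
```

Since the kernel only uses float today, simply removing the conditional and the cl_dp flag is the smaller change; the sketch only shows what a non-dead branch would need to do.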

LOW: single-device path skips the GPU-type filter. When exactly one platform and one device exist, that device is selected as _fastest_device without the "GPU" in device_type check that the multi-device branch applies. A CPU-only OpenCL runtime (e.g. PoCL) would then run the GPU path on a CPU device.
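One way to close this gap is to apply the filter before any single-device shortcut, so both branches go through the same check. A sketch under assumptions: select_fastest_gpu, type_name and speed are hypothetical names, and the real ranking used to pick _fastest_device is not shown in this issue.

```python
def select_fastest_gpu(devices, type_name, speed=lambda d: 0):
    """Pick the best GPU from devices, or None if no GPU is present.

    type_name maps a device to a type string such as "GPU" or "CPU"
    (with pyopencl this could be cl.device_type.to_string(d.type)).
    speed is a placeholder scoring function; higher means faster.
    """
    # Filter first, so a lone CPU device is rejected like any other CPU.
    gpus = [d for d in devices if "GPU" in type_name(d)]
    if not gpus:
        return None
    return max(gpus, key=speed)
```

Returning None for a CPU-only runtime lets the caller treat it the same way as a missing pyopencl install instead of silently driving the GPU path on a CPU.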

LOW: debug print + redundant except tuple. In except (ImportError, OSError, Exception), Exception subsumes the others, and print("This exception is what's causing cl equals None:", e) runs on every import on machines without pyopencl. Reduce to except Exception and drop the print or downgrade it to warnings.warn/logging.
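The suggested cleanup could look like the following sketch. optional_import is a hypothetical helper, and the choice of logging at debug level (rather than warnings.warn) is an assumption:

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def optional_import(name):
    """Import a module by name, returning None if it cannot be loaded."""
    try:
        return importlib.import_module(name)
    except Exception as exc:  # ImportError and OSError are both subclasses
        # Debug level: silent by default, visible when logging is configured.
        logger.debug("Could not import %s; leaving it as None: %s", name, exc)
        return None


cl = optional_import("pyopencl")
```

Users without pyopencl then see nothing on import, while the reason cl is None is still available to anyone who enables debug logging for the module.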
