I started exploring different backends and building out a timing function; it looks like some pretty nice gains are on the table.
Options:

**Numba**
A just-in-time compiler for Python that can accelerate NumPy-style code with minimal syntax changes. In practice, many of the NumPy APIs we're using aren't yet supported or vectorized under Numba, so we'd end up rewriting most functions, or Numba would still fall back to Python object mode.

**Taichi (via LLVM)**
A data-oriented DSL that compiles kernels down through LLVM. We write Taichi-style kernels rather than plain Python, but in exchange we get explicit control over parallelism and memory layout on CPU (and GPU). It's similar in concept to NVIDIA Warp: we trade a new API for high throughput and fine-grained scheduling.

**C++ (did not test this)**
Hand-written C++ lets us exploit every ounce of performance: SIMD intrinsics, custom memory pools, zero-overhead abstractions. But it requires a full rewrite plus build tooling, Python bindings (pybind11 / CPython API), and careful maintenance of cross-platform builds.
Ran the test on calculating the coagulation gain rate.
We want efficiency to be close to 1 or greater (it accounts for CPU clock speed, so the number should be comparable across different computers).

- call = the function call being timed
- cycles = CPU cycles
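A metric like this could be computed roughly as follows. This is a hypothetical sketch, not the actual timing function: the helper name `time_call`, the hard-coded `cpu_freq_hz`, and the `baseline_cycles` "ideal cost" are all assumptions made for illustration.

```python
import time


def time_call(func, *args, repeats=10, cpu_freq_hz=3.0e9, baseline_cycles=1.0):
    """Hypothetical timing helper.

    Converts best-of-N wall time into CPU cycles using an assumed clock
    frequency, then reports efficiency = baseline_cycles / measured
    cycles. Dividing out the clock speed is what makes the number
    roughly comparable across machines.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    cycles = min(timings) * cpu_freq_hz
    return {
        "call": func.__name__,
        "cycles": cycles,
        "efficiency": baseline_cycles / cycles,
    }


# Example: summing 1000 numbers, with an assumed baseline of one
# cycle per element.
report = time_call(sum, range(1000), baseline_cycles=1000.0)
```

An efficiency near 1 means the function costs about as many cycles as the assumed baseline; well below 1 means there is overhead left to claw back.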
@mahf708 @wkchuang