Hi @dhjx1996, I have already implemented this. My implementation does not do anything special to optimise. I did, however, have to manually change some of your expressions so that work was not repeated along the path, and so that BLAS operations (often matrix-vector multiplies) are activated whenever possible. From timing, I think we currently spend 90% or so of the DISORT port's runtime inside the LAPACK functions, so I am not sure we could save much time in our own code unless we optimise the core LAPACK calls themselves. One of the places where CDISORT is significantly faster is when it computes pure fluxes. Somehow, it must be using a completely different path through the code, or some optimisation I am not aware of.
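To illustrate the kind of restructuring meant above (a toy sketch, not taken from the actual PythonicDISORT or ARTS code): the same contraction can be written as explicit loops, which BLAS never sees, or as a matrix-vector product, which NumPy dispatches to a BLAS `*gemv` call.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

# Hand-rolled version: explicit Python loops, no BLAS involvement.
y_loop = np.zeros(n)
for i in range(n):
    for j in range(n):
        y_loop[i] += A[i, j] * x[j]

# Restructured version: the same contraction expressed as a single
# matrix-vector product, which NumPy forwards to BLAS (dgemv).
y_blas = A @ x

assert np.allclose(y_loop, y_blas)
```

The two produce identical results; the second form just hands the whole inner loop to the optimised BLAS kernel instead of the interpreter.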
Hi @riclarsson, this is an idea I have had ever since we discussed the importance of `numpy.einsum` in making PythonicDISORT fast; sorry that I am following up on it only now (and maybe you have already done this). You probably know that all of NumPy and SciPy, not to mention many base Python functions, are implemented in C. Here is the source code for `np.einsum`: https://github.com/numpy/numpy/blob/e2805398f9a63b825f4a2aab22e9f169ff65aae9/numpy/core/src/multiarray/einsum.c.src. Would such C source code help in porting important NumPy, SciPy, and other Python functions into ARTS? A well-built `einsum` algorithm may also greatly speed up tensor operations in other parts of ARTS. In this vein, the C source code of the `scipy.sparse` library may be worth looking into as well.
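As a small sketch of what a well-built `einsum` buys: a single call can express a tensor contraction that would otherwise require transposes and reshapes. The shapes and axis names below are hypothetical, not taken from PythonicDISORT.

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((3, 4, 5))   # hypothetical (layers, streams, moments)
M = rng.standard_normal((5, 4))      # hypothetical (moments, streams_out)

# Contract the last axis of T with the first axis of M:
#   out[l, s, t] = sum_m T[l, s, m] * M[m, t]
out_einsum = np.einsum("lsm,mt->lst", T, M)

# Equivalent formulation via matmul, broadcast over the leading axis.
out_matmul = T @ M

assert out_einsum.shape == (3, 4, 4)
assert np.allclose(out_einsum, out_matmul)
```

A native `einsum` in ARTS would let such contractions be written once, in index notation, and still compile down to efficient loops or BLAS calls.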