Update GT4Py/DaCe to get "Debug Backend" and "Schedule Tree Bridge" #170
Merged
Remove un-needed passes in pipeline
Point the submodules to the oir->stree->sdfg branches in gt4py and dace. This allows running the NDSL tests against these branches.
Splittable regions were based on StencilComputation library nodes, which no longer exist since we torched the old bridge. To be re-implemented at the stree level (if we still want to keep it).
…o feature/oir_stree_sdfg_bridge
dace -> Move main visitor out and remove print statements gt4py -> Update DaCe version (remove debug print statements)
- dace: include Phil's fix to cycle detection - gt4py: update dace dependency and fix cpu memory layout
- dace: fix write access caching - gt4py: DDE issue to be investigated
- dace: patch DDE not to attempt to inline pointers - gt4py: re-enable DDE
twicki approved these changes Jul 29, 2025
twicki (Collaborator) left a comment:
Looks good! Thanks for the update
2 tasks
jjuyeonkim pushed a commit to jjuyeonkim/NDSL that referenced this pull request Sep 8, 2025
PR GridTools/gt4py#2067 in GT4Py fundamentally changed how the `dace:*` backends behave. In that PR, we changed the strategy to make use of an upcoming DaCe feature called "Schedule Tree", which we will use for optimization purposes. This new "bridge" between GT4Py and DaCe, allows for a much cleaner design where both packages handle nothing more than what they need to. A drop in performance is to be expected (especially on CPU) as we have deactivated local caching for now. But the very next task is to re-use this new platform to allow for much more improved and aggressive merging capacities, local caching and hardware-driven tiling.
This PR updates GT4Py to a version that includes the above mentioned "Schedule tree bridge" and removes NDSL-level optimizations in the orchestration pipeline. As said, this will come with a temporary dip in performance, which we plan to restore with upcoming pull requests.
In addition to updating the GT4Py and DaCe versions, we include the following two changes in this PR
1. Expose compiler optimization level as `GT4PY_COMPILE_OPT_LEVEL`, defaults to `3` as before.
2. Minor change in import style in `ndsl/dsl/dace/orchestration.py`.
---------
Co-authored-by: Florian Deconinck <deconinck.florian@gmail.com>
Co-authored-by: Roman Cattaneo <1116746+romanc@users.noreply.github.com>
Description
PR GridTools/gt4py#2067 in GT4Py fundamentally changed how the `dace:*` backends behave. In that PR, we changed the strategy to make use of an upcoming DaCe feature called "Schedule Tree", which we will use for optimization purposes. This new "bridge" between GT4Py and DaCe allows for a much cleaner design where both packages handle nothing more than what they need to. A drop in performance is to be expected (especially on CPU) as we have deactivated local caching for now. The very next task is to build on this new platform to enable more aggressive merging, local caching, and hardware-driven tiling.
This PR updates GT4Py to a version that includes the above-mentioned "Schedule Tree bridge" and removes NDSL-level optimizations in the orchestration pipeline. As said, this will come with a temporary dip in performance, which we plan to restore with upcoming pull requests.
The GT4Py update in this PR also includes the "Debug Backend", a plain Python backend that is useful for prototyping new DSL features and (to some extent) debugging. The "Debug Backend" was designed for readability and is glacially slow for real-world sized problems. Don't try to run anything large with this backend.
In addition to updating the GT4Py and DaCe versions, we include the following two changes in this PR:
1. Expose compiler optimization level as `GT4PY_COMPILE_OPT_LEVEL`, defaults to `3` as before.
2. Minor change in import style in `ndsl/dsl/dace/orchestration.py`.
How Has This Been Tested?
CI is green. Additional local testing with PyFV3 translate tests and the AI2 data.
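For local runs like the ones above, the optimization level exposed through `GT4PY_COMPILE_OPT_LEVEL` could be lowered to shorten compile times. The snippet below is only a sketch of the "defaults to `3`" behaviour described in this PR: the helper `compile_opt_level` is hypothetical and not part of NDSL or GT4Py, and reading the variable from the process environment as a string is an assumption.

```python
import os


def compile_opt_level(env=None):
    """Return the requested optimization level as a string.

    Hypothetical helper, not part of NDSL or GT4Py. It only mirrors the
    documented default: when GT4PY_COMPILE_OPT_LEVEL is unset, fall back to "3".
    """
    # Assumption: the setting is looked up in the process environment.
    env = os.environ if env is None else env
    return env.get("GT4PY_COMPILE_OPT_LEVEL", "3")


# Variable not set: the default level is used.
default_level = compile_opt_level({})

# Variable set explicitly, e.g. for a quicker debug build.
debug_level = compile_opt_level({"GT4PY_COMPILE_OPT_LEVEL": "0"})
```

In practice the variable would be exported in the shell before launching the tests. Whether it has to be set before the packages are imported is not stated in this PR.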
Checklist: