$ git fetch origin wjy/recurse
$ git checkout wjy/recurse
$ _bn && NVFUSER_DISABLE=parallel_compile python repro.py
NVFUSER_DISABLE=parallel_compile is set so that the call stack is displayed more clearly.
To make debugging easier, I limited the recursion depth of lessEqual to 100 levels. The reproducer is still valid with that check removed, just slower.
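
For readers unfamiliar with this kind of check, here is a minimal sketch of a recursion depth guard. It is written in Python only for brevity and is not the code on the branch: `limit_depth`, `DepthLimitExceeded`, and `runaway` are hypothetical names, and `runaway` is just a stand-in for a function that never reaches its base case.

```python
class DepthLimitExceeded(RuntimeError):
    """Raised when a guarded function nests deeper than its limit."""


def limit_depth(max_depth):
    """Decorator (hypothetical helper) that caps how deeply fn may recurse."""
    def decorator(fn):
        depth = 0  # number of currently active calls to fn

        def wrapper(*args, **kwargs):
            nonlocal depth
            if depth >= max_depth:
                raise DepthLimitExceeded(
                    f"{fn.__name__} recursed past {max_depth} levels"
                )
            depth += 1
            try:
                return fn(*args, **kwargs)
            finally:
                # Always restore the counter, even when an exception unwinds.
                depth -= 1

        return wrapper
    return decorator


@limit_depth(100)
def runaway(n):
    # Stand-in for a recursive comparison with no reachable base case.
    return runaway(n + 1)


@limit_depth(100)
def countdown(n):
    # A well-behaved recursion that stays under the limit.
    return 0 if n == 0 else countdown(n - 1)
```

With a guard like this, the runaway case fails quickly with a short stack instead of running until the stack is exhausted, which is why the limited version reproduces faster.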