2 changes: 2 additions & 0 deletions Examples/CMakeLists.txt
@@ -145,6 +145,8 @@ function(add_warpx_test
)
# FIXME Use helper function to handle Windows exceptions
set_property(TEST ${name}.run APPEND PROPERTY ENVIRONMENT "PYTHONPATH=$ENV{PYTHONPATH}:${CMAKE_PYTHON_OUTPUT_DIRECTORY}")
# allocate GPU memory on-demand for Python tests too (same reason as native tests below)
set_property(TEST ${name}.run APPEND PROPERTY ENVIRONMENT "AMREX_THE_ARENA_INIT_SIZE=0")
Member:
Good-ish, but should be generalized for both tests to use: AMREX_DEFAULT_INIT

AMReX-Codes/amrex#4947

else()
# TODO Use these for Python tests too
set(runtime_params
@@ -76,6 +76,7 @@ def main(args):
            OpenPMDTimeSeries(args.path)
        except Exception:
            print("Could not open the file as a plotfile or an openPMD time series")
            args.format = "openpmd"
Member:

Why this assignment?
All file format checks have failed at this point, no?

Member:

Moreover, analysis_default_regression.py here should be just a link to https://github.com/BLAST-WarpX/warpx/blob/development/Examples/analysis_default_regression.py. Fixing the code within the test directory won't work. If it is not a link, but a hard copy, that should be fixed in the first place.

Member:

It looks like the script here changes args.rtol for restart, though; that needs some generalization.

Do you like to simplify the format detection logic? @EZoni
@lucafedeli88 suggests:

I would simply change the logic and exit with an error message if the script can't read the output file.

Member:

The link issue is fixed in #6810. Further fixes to the underlying code, if needed, should be done in the original file.

Member @EZoni (Apr 27, 2026):

#6810 and #6811 fix the link issues, but coming back to the original suggestion, @ax3l and @lucafedeli88,

Do you like to simplify the format detection logic? @EZoni
@lucafedeli88 suggests:

I would simply change the logic and exit with an error message if the script can't read the output file.

I think I don't understand it.

The current code is

# set args.format automatically
try:
    yt.load(args.path)
except Exception:
    try:
        OpenPMDTimeSeries(args.path)
    except Exception:
        print("Could not open the file as a plotfile or an openPMD time series")
    else:
        args.format = "openpmd"
else:
    args.format = "plotfile"

so an error ("Could not open the file as a plotfile or an openPMD time series") is already raised as Exception if the code can't read the output file:

  1. It first tries to open args.path using yt.load(args.path) .
  2. If that succeeds, it sets args.format = "plotfile" via the else clause.
  3. If that fails, it falls into the outer except and tries OpenPMDTimeSeries(args.path) instead.
  4. If that succeeds, it sets args.format = "openpmd".
  5. If that also fails, it prints an error message and args.format is never set.

Member @EZoni (Apr 27, 2026):

What is it that does not work with this and/or what is it that you would like to change? Simply adding a sys.exit(1) after the error message?

Member:

Here's my attempt of interpretation, let me know if this is what you had in mind: #6812.

Contributor (author) @zippylab (Apr 27, 2026):

Without this change, args.format remains unset if the inner exception is thrown, correct? Subsequent usage of args.format could be problematic. Something that exits before main() is invoked would avoid passing down an unset args.format (e.g., remove the extra else: in the inner try block and replace it with sys.exit(1) as suggested). If all the code is robust against, or agnostic to, an unset value, then it doesn't matter and the code could stay as it was before this commit. @EZoni, your fix in #6812 handles it the best way.
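To make the proposal concrete, here is a minimal sketch of the exit-on-failure detection logic discussed in this thread. The `detect_format` helper and the injectable `loaders` list are illustrative only (they are not in the script); the real code calls `yt.load` and `OpenPMDTimeSeries` directly:

```python
import sys

def detect_format(path, loaders):
    """Return the first format whose loader can open `path`;
    exit(1) with an error message if none can."""
    for fmt, loader in loaders:
        try:
            loader(path)
            return fmt
        except Exception:
            continue
    print("Could not open the file as a plotfile or an openPMD time series")
    sys.exit(1)

# In the real script the loaders would be, e.g.:
# loaders = [("plotfile", yt.load), ("openpmd", OpenPMDTimeSeries)]
```

With this shape, args.format can never leak through unset: either a format is returned or the process exits with a nonzero code that ctest can see.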

        else:
            args.format = "openpmd"
    else:
Expand Down
4 changes: 2 additions & 2 deletions Examples/Tests/electrostatic_sphere_eb/analysis_rz_mr.py
@@ -26,15 +26,15 @@
def find_first_non_zero_from_bottom_left(matrix):
    for i in range(matrix.shape[0]):
        for j in range(matrix.shape[1]):
            if (matrix[i][j] != 0) and (matrix[i][j] != np.nan):
            if (matrix[i][j] != 0) and (not np.isnan(matrix[i][j])):
                return (i, j)
Comment on lines +29 to 30
Member:

Good fix. np.nan != np.nan is always True, so the old code was buggy. (NaN is never equal to anything, including itself.)
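A quick sketch of why the old guard was a no-op (plain Python floats suffice here; `np.isnan` behaves the same as `math.isnan` for scalars):

```python
import math

nan = float("nan")

# NaN compares unequal to everything, including itself, so the old
# guard `matrix[i][j] != np.nan` is always True and filters nothing.
assert (nan != nan) is True
assert not (nan == nan)

# The correct check is an explicit isnan test, as in the fix above.
assert math.isnan(nan)
assert not math.isnan(3.0)
```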

Member:

Isolated to faster review & merge to #6807

    return i, j


def find_first_non_zero_from_upper_right(matrix):
    for i in range(matrix.shape[0] - 1, -1, -1):
        for j in range(matrix.shape[1] - 1, -1, -1):
            if (matrix[i][j] != 0) and (matrix[i][j] != np.nan):
            if (matrix[i][j] != 0) and (not np.isnan(matrix[i][j])):
                return (i, j)
    return i, j

@@ -15,7 +15,7 @@ my_constants.dt = 0.1/wpe # s
#################################
max_step = 20
amr.n_cell = 40 40
amr.max_grid_size = 8
amr.max_grid_size = 16
Member:

Looks a bit random: Why did you increase this but not blocking factor, too?

Member:

It was explained here. #6801 (comment)

Member:

Thanks, I saw the PR description now too.

Adjust test inputs for GPU compatibility (tiling assertion, grid decomposition warnings)

I think we probably want to increase max_grid_size and blocking factor for the tests that throw warnings.

amr.blocking_factor = 8
amr.max_level = 0
geometry.dims = 2
@@ -15,7 +15,7 @@ my_constants.dt = 0.1/wpe # s
#################################
max_step = 20
amr.n_cell = 40 40
amr.max_grid_size = 8
amr.max_grid_size = 16
amr.blocking_factor = 8
amr.max_level = 0
geometry.dims = 2
@@ -4,6 +4,7 @@ FILE = inputs_base_2d
# test input parameters
algo.maxwell_solver = psatd
amr.max_level = 1
amr.max_grid_size = 128
amr.ref_ratio = 4
diag1.electrons.variables = x z w ux uy uz
diag1.positrons.variables = x z w ux uy uz
@@ -29,7 +29,7 @@ algo.particle_shape = 1
algo.current_deposition = direct

particles.species_names = beam
particles.do_tiling = 1
particles.do_tiling = 0
Member:

Looks a bit random: Why changed?

Contributor (author):

Without this change, the test_3d_magnetostatic_eb_run fails on the AMREX_ALWAYS_ASSERT below when run on a GPU:

 1358 // The GPU implementation of Redistribute
 1359 //
 1360 template <typename ParticleType, int NArrayReal, int NArrayInt,
 1361           template<class> class Allocator, class CellAssignor>
 1362 void
 1363 ParticleContainer_impl<ParticleType, NArrayReal, NArrayInt, Allocator, CellAssignor>
 1364 ::RedistributeGPU (int lev_min, int lev_max, int nGrow, int local, bool remove_negative)
 1365 {
 1366 #ifdef AMREX_USE_GPU
 1367
 1368     if (local) { AMREX_ASSERT(numParticlesOutOfRange(*this, lev_min, lev_max, local) == 0); }
 1369
 1370     // sanity check
 1371     AMREX_ALWAYS_ASSERT(do_tiling == false);

From the run without changing the input parameter:

 8/55 Test #503: test_3d_magnetostatic_eb.run .....................................***Failed    2.35 sec
Initializing AMReX (26.04-68-g7e9ce72d229c)...
MPI initialized with 1 MPI processes
MPI initialized with thread support level 3
Initializing SYCL...
Multiple GPUs are visible to each MPI rank. This is usually not an issue. But this may lead to incorrect or suboptimal rank-to-GPU mapping.!
SYCL initialized with 1 device.
AMReX (26.04-68-g7e9ce72d229c) initialized
PICSAR (25.06)
WarpX (26.03-105-g18012c92d2bb)

                                   ___
    __        __             ___  /  /
    \ \      / /_ _ _ __ _ __\  \/  /
     \ \ /\ / / _` | '__| '_ \\    /
      \ V  V / (_| | |  | |_) /    \
       \_/\_/ \__,_|_|  | .__/__/\  \
                        |_|       \__\

Level 0: dt = 1e-12 ; dx = 0.0078125 ; dy = 0.0078125 ; dz = 0.015625
terminate called after throwing an instance of 'std::runtime_error'
  what():  Assertion `do_tiling == false' failed, file "/home/zippy/src/warpx/build_RelWithDebInfo/_deps/fetchedamrex-src/Src/Particle/AMReX_ParticleContainerI.H", line 1371
SIGABRT
See Backtrace.0 file for details
Abort(6) on node 0 (rank 0 in comm 496): application called MPI_Abort(comm=0x84000000, 6) - process 0
x1922c6s0b0n0: rank 0 exited with code 6

Member:

Ouch, yes! Fix in #6809

beam.mass = m_e
beam.charge = -q_e
beam.injection_style = nuniformpercell
1 change: 1 addition & 0 deletions Examples/analysis_default_regression.py
@@ -74,6 +74,7 @@ def main(args):
            OpenPMDTimeSeries(args.path)
        except Exception:
            print("Could not open the file as a plotfile or an openPMD time series")
            args.format = "openpmd"
        else:
            args.format = "openpmd"
    else:
42 changes: 39 additions & 3 deletions Examples/analysis_default_restart.py
@@ -49,18 +49,54 @@ def check_restart(filename, tolerance=1e-12):
        dims=ds_benchmark.domain_dimensions,
    )

    # Loop over all fields (all particle species, all particle attributes, all grid fields)
    # and compare output data generated from initial run with output data generated after restart
    # Separate grid fields from particle fields. Particle fields use the
    # species name as field type; grid fields use 'boxlib'.
    particle_species = set()
    grid_fields = []
    for field in ds_benchmark.field_list:
        ftype, fname = field
        if ftype == "boxlib":
            grid_fields.append(field)
        elif ftype != "all":
            particle_species.add(ftype)

    print(f"\ntolerance = {tolerance}")
    print()
    for field in ds_benchmark.field_list:

    # Compare grid fields directly (order is deterministic)
    for field in grid_fields:
        dr = ad_restart[field].squeeze().v
        db = ad_benchmark[field].squeeze().v
        error = np.amax(np.abs(dr - db))
        if np.amax(np.abs(db)) != 0.0:
            error /= np.amax(np.abs(db))
        print(f"field: {field}; error = {error}")
        assert error < tolerance

    # Compare particle fields sorted by (particle_cpu, particle_id), since
    # Redistribute() after checkpoint-restart may reorder particles across
    # tiles/ranks. The (cpu, id) pair is the unique particle key in AMReX.
    for species in sorted(particle_species):
        species_fields = [f for f in ds_benchmark.field_list if f[0] == species]

Comment on lines +76 to +81
Member @ax3l (Apr 27, 2026):

Trying to understand what this patch improves: the stability of checksums?

Contributor (author):

Yes. Redistribute after restart from a checkpoint can reorder particles across ranks. Sorting both snapshots by the (cpu, id) key puts the compared data in a common order, so the checksum calculations agree. Some tests were failing that checksum comparison without it.
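A minimal sketch of the effect (hypothetical data; the real script reads these arrays from yt datasets): a permutation applied after restart is undone by sorting both snapshots with np.lexsort on the (cpu, id) key, so per-particle attributes line up again.

```python
import numpy as np

# Benchmark snapshot: unique key is (particle_cpu, particle_id)
id_b = np.array([1, 2, 1, 2])
cpu_b = np.array([0, 0, 1, 1])
w_b = np.array([10.0, 20.0, 30.0, 40.0])  # some per-particle attribute

# Restart snapshot: same particles, permuted by Redistribute()
perm = np.array([3, 0, 2, 1])
id_r, cpu_r, w_r = id_b[perm], cpu_b[perm], w_b[perm]

# Direct comparison fails...
assert not np.array_equal(w_b, w_r)

# ...but sorting both sides by (cpu, id) restores a common order.
# np.lexsort sorts by the LAST key first, so cpu is the primary key.
sort_b = np.lexsort((id_b, cpu_b))
sort_r = np.lexsort((id_r, cpu_r))
assert np.array_equal(w_b[sort_b], w_r[sort_r])
```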

        id_r = np.atleast_1d(ad_restart[(species, "particle_id")].squeeze().v)
        id_b = np.atleast_1d(ad_benchmark[(species, "particle_id")].squeeze().v)
        cpu_r = np.atleast_1d(ad_restart[(species, "particle_cpu")].squeeze().v)
        cpu_b = np.atleast_1d(ad_benchmark[(species, "particle_cpu")].squeeze().v)

        sort_r = np.lexsort((id_r, cpu_r))
        sort_b = np.lexsort((id_b, cpu_b))

        for field in species_fields:
            if field[1] in ("particle_id", "particle_cpu"):
                continue
            dr = np.atleast_1d(ad_restart[field].squeeze().v)[sort_r]
            db = np.atleast_1d(ad_benchmark[field].squeeze().v)[sort_b]
            error = np.amax(np.abs(dr - db))
            if np.amax(np.abs(db)) != 0.0:
                error /= np.amax(np.abs(db))
            print(f"field: {field}; error = {error}")
            assert error < tolerance
    print()


32 changes: 31 additions & 1 deletion Python/pywarpx/_libwarpx.py
@@ -71,6 +71,17 @@ def load_library(self):
                "Please write separate scripts for each geometry."
            )

        # Import mpi4py before the pyAMReX (amrex.space*d) shared library is
        # loaded so that mpi4py calls MPI_Init_thread first. With the Cray
        # MPICH on Polaris/Sirius, if the AMReX shared library is loaded before
Comment on lines +74 to +76
Member @ax3l (Apr 27, 2026):

We need to talk about this.

Generally we support both, either AMReX-initialized MPI or externally (e.g., mpi4py initialized MPI).

We need to see your reproducer where you had issues loading, so we can understand this better.
The problem does not appear for us on other Cray MPICH systems like Perlmutter (NERSC) or Frontier (OLCF) or Tuolumne (LLNL) or Adastra.

Your patch here disables one of the two modes we support.

Contributor (author):

The issue with the ordering of import mpi4py is Polaris-specific. It has a different Cray MPICH version and other differences in main software versions w.r.t. Perlmutter, and we've had multiple user issues reported that required re-ordering import statements like this. Since it breaks the other mode, though, it clearly needs a different approach. I'll break this out into an issue and identify the test that demonstrates it.

        # mpi4py initializes MPI, mpi4py's later MPI_Init_thread conflicts with
        # the already-loaded MPI symbols and causes hangs or unbounded memory
        # allocation during amrex::Initialize().
        try:
            from mpi4py import MPI  # noqa: F811,F401
        except ImportError:
            pass  # mpi4py is optional; MPI_Init handled by AMReX if absent

        # --- Use geometry to determine whether to import the 1D, 2D, 3D or RZ version.
        # --- The geometry must be setup before the lib warpx shared object can be loaded.
        try:
@@ -147,7 +158,7 @@ def load_library(self):
        register_warpx_WarpXParticleContainer_extension(self.libwarpx_so)

    def amrex_init(self, argv, mpi_comm=None):
        if mpi_comm is None:  # or MPI is None:
        if mpi_comm is None:
            self.libwarpx_so.amrex_init(argv)
        else:
            raise Exception("mpi_comm argument not yet supported")
@@ -172,9 +183,28 @@ def finalize(self, finalize_mpi=1):
        """
        # TODO: simplify, part of pyAMReX already
        if self.initialized:
            # GPU finalization workaround: on GPU backends, destroying the C++
            # WarpX object (del self.warpx) or calling amrex_finalize() can
            # crash in SYCL/CUDA/HIP Device::Finalize() when it tries to
            # synchronize streams whose backing static objects are already gone.
Comment on lines +187 to +189
Member @ax3l (Apr 27, 2026):

@zippylab reproducer/example needed please. Not clear when this happens. This patch looks like a bandage.

cc @WeiqunZhang in case this general MPI_Barrier attempt here appears useful for amrex::Finalize()...

Member @ax3l (Apr 27, 2026):

PR description points to issues in:

On all GPU backends, the Python atexit finalization can crash in amrex::Gpu::Device::Finalize() when it synchronizes streams whose backing C++ static objects (external_stream_stack) have already been destroyed.

This could be from external streams maybe initialized by either external MPI or by another lib not shown as reproducer, e.g., cupy/dpnp ...? Reproducer would be super helpful, because clearly the patch here cannot be merged.

This is an aggressive workaround — it bypasses all C++ cleanup and relies on the job launcher for process cleanup. The proper fix requires changes in AMReX to address the static destructor ordering in Gpu::Device. Without this workaround, PICMI tests with large particle counts (e.g., test_2d_ohm_solver_landau_damping_picmi) are marked FAILED despite completing the simulation correctly, because ctest sees the crash exit code.

Moved upstream to #5386

Contributor (author) @zippylab (Apr 27, 2026):

Thanks for putting in the AMReX Issue, @ax3l . I'll find one of the example reproducers from my notes.

            # Exit immediately via os._exit(0) BEFORE any C++ destructors run,
            # and let the job launcher (PALS/mpiexec) handle cleanup.
            try:
                if self.libwarpx_so.Config.gpu_backend in ("SYCL", "CUDA", "HIP"):
                    try:
                        from mpi4py import MPI

                        MPI.COMM_WORLD.Barrier()
Member:

So your strategy here is to call an MPI Barrier, see that it crashes, and then os._exit(0), i.e., without an error code? This looks like a massive hack, no? :)

Member:

Codex comment:

High: Python/pywarpx/_libwarpx.py:200 calls os._exit(0) from finalize(), which
is both an atexit handler and public API via pywarpx.warpx.finalize() / PICMI
Simulation.finalize(). On GPU backends this can turn an uncaught Python
exception after WarpX initialization into a successful process exit, hiding
failed tests. It also makes explicit finalize() terminate the interpreter
instead of returning. This should be split from normal/manual finalization,
and it should not force exit code 0 on error paths.
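The failure mode described above can be reproduced in isolation. This is a self-contained sketch, not WarpX code: an atexit handler that calls os._exit(0) makes a crashing script report success.

```python
import subprocess
import sys
import textwrap

# A child script that fails after registering an os._exit(0) atexit
# handler, mimicking the GPU finalization workaround in finalize().
prog = textwrap.dedent("""
    import atexit, os
    atexit.register(lambda: os._exit(0))
    raise RuntimeError("simulation failed")
""")

res = subprocess.run([sys.executable, "-c", prog], capture_output=True)
# The uncaught exception is reported on stderr, but the exit status is 0,
# so ctest would mark the test as passed.
assert res.returncode == 0
assert b"RuntimeError" in res.stderr
```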

                    except Exception:
                        pass  # best-effort barrier; proceeding to _exit
                    os._exit(0)
            except Exception:
                pass  # Config may not be available; fall through to normal cleanup

            del self.warpx
            # The call to warpx_finalize causes a crash - don't know why
            # self.libwarpx_so.warpx_finalize()

            self.libwarpx_so.amrex_finalize()

        from pywarpx import callbacks
4 changes: 4 additions & 0 deletions Source/Diagnostics/WarpXOpenPMD.cpp
@@ -593,6 +593,10 @@ for (const auto & particle_diag : particle_diags) {
        particlesConvertUnits(ConvertDirection::SI_to_WarpX, pc, mass);
    }

    // On GPU backends, copyParticles may be asynchronous; synchronize before
    // passing pinned-memory pointers to openPMD's storeChunkRaw.
    amrex::Gpu::streamSynchronize();
Comment on lines +596 to +598
Member @ax3l (Apr 27, 2026):

Sorry Claude, it is not asynchronous, as you would have seen had you checked the header instead of guessing:
https://github.com/AMReX-Codes/amrex/blob/26.04/Src/Particle/AMReX_ParticleTransformation.H#L148-L198

Please see:
https://warpx.readthedocs.io/en/latest/developers/how_to_develop_with_llms.html

Member @ax3l (Apr 27, 2026):

Can you please provide a minimal reproducer that failed? All the ops above/below are ParIter loops that always auto-synchronize, so I cannot spot an async race condition here. But there might be something we overlook.


// Gather the electrostatic potential (phi) on the macroparticles
if ( particle_diag.m_plot_phi ) {
storePhiOnParticles( tmp, WarpX::electrostatic_solver_id, !use_pinned_pc );
Expand Down
28 changes: 23 additions & 5 deletions Source/Particles/ParticleCreation/DefaultInitialization.H
@@ -14,13 +14,31 @@
# include "Particles/ElementaryProcess/QEDInternals/QuantumSyncEngineWrapper.H"
#endif

#include <AMReX_GpuAllocators.H>
#include <AMReX_GpuContainers.H>
#include <AMReX_REAL.H>

#include <cmath>
#include <map>
#include <string>

namespace ParticleCreation {
    /** Whether particle data with allocator Alloc lives on the GPU.
     * amrex::RunOnGpu is not specialised for PolymorphicArenaAllocator,
     * so we extend the check here. PolymorphicArenaAllocator in WarpX
     * always wraps a device arena, so it is safe to treat it as GPU. */
    template <typename Alloc>
    constexpr bool particles_on_gpu ()
    {
#ifdef AMREX_USE_GPU
        return amrex::RunOnGpu<Alloc>::value
            || amrex::IsPolymorphicArenaAllocator<Alloc>::value;
#else
        return false;
#endif
    }
} // namespace ParticleCreation

/**
 * \brief This set of initialization policies describes what happens
 * when we need to create a new particle due to an elementary process.
@@ -158,7 +176,7 @@ void DefaultInitializeRuntimeAttributes (PTile& ptile,
        const QuantumSynchrotronGetOpticalDepth quantum_sync_get_opt =
            p_qs_engine->build_optical_depth_functor();
        // If the particle tile was allocated in a memory pool that can run on GPU, launch GPU kernel
        if constexpr (amrex::RunOnGpu<typename PTile::template AllocatorType<amrex::Real>>::value) {
        if constexpr (ParticleCreation::particles_on_gpu<typename PTile::template AllocatorType<amrex::Real>>()) {
Member @ax3l (Apr 27, 2026):

@atmyers @WeiqunZhang this could be a legit bug from the transition to the polymorphic PC, do I see that right? But generally speaking, this cannot be a constexpr at all anymore (I think the flagging of this line is right; the patch I would write as an arena runtime check...). Do you agree?

The patch now treats every PolymorphicArenaAllocator as GPU-resident, which is wrong. (The development code treats every PolymorphicArenaAllocator as CPU-resident, which is wrong, too.)

This pattern is very likely used more often in the code base, both in WarpX (5x in init) and in AMReX.

Member:

Right. If it is polymorphic, the check can only be done at runtime with arena()->isManaged() || arena()->isDevice().

Member:

Fix isolated in #6808

            amrex::ParallelForRNG(stop - start,
                [=] AMREX_GPU_DEVICE (int i, amrex::RandomEngine const& engine) noexcept {
                    const int ip = i + start;
@@ -185,7 +203,7 @@ void DefaultInitializeRuntimeAttributes (PTile& ptile,
        const BreitWheelerGetOpticalDepth breit_wheeler_get_opt =
            p_bw_engine->build_optical_depth_functor();;
        // If the particle tile was allocated in a memory pool that can run on GPU, launch GPU kernel
        if constexpr (amrex::RunOnGpu<typename PTile::template AllocatorType<amrex::Real>>::value) {
        if constexpr (ParticleCreation::particles_on_gpu<typename PTile::template AllocatorType<amrex::Real>>()) {
            amrex::ParallelForRNG(stop - start,
                [=] AMREX_GPU_DEVICE (int i, amrex::RandomEngine const& engine) noexcept {
                    const int ip = i + start;
@@ -214,7 +232,7 @@ void DefaultInitializeRuntimeAttributes (PTile& ptile,
        const amrex::ParserExecutor<7> user_real_attrib_parserexec =
            user_real_attrib_parser[ia]->compile<7>();
        // If the particle tile was allocated in a memory pool that can run on GPU, launch GPU kernel
        if constexpr (amrex::RunOnGpu<typename PTile::template AllocatorType<amrex::Real>>::value) {
        if constexpr (ParticleCreation::particles_on_gpu<typename PTile::template AllocatorType<amrex::Real>>()) {
            amrex::ParallelFor(stop - start,
                [=] AMREX_GPU_DEVICE (int i) noexcept {
                    const int ip = i + start;
@@ -246,7 +264,7 @@ void DefaultInitializeRuntimeAttributes (PTile& ptile,
        if (it_ioniz != particle_icomps.end() &&
            std::distance(particle_icomps.begin(), it_ioniz) == j)
        {
            if constexpr (amrex::RunOnGpu<typename PTile::template AllocatorType<int>>::value) {
            if constexpr (ParticleCreation::particles_on_gpu<typename PTile::template AllocatorType<int>>()) {
                amrex::ParallelFor(stop - start,
                    [=] AMREX_GPU_DEVICE (int i) noexcept {
                        const int ip = i + start;
@@ -268,7 +286,7 @@ void DefaultInitializeRuntimeAttributes (PTile& ptile,
        {
            const amrex::ParserExecutor<7> user_int_attrib_parserexec =
                user_int_attrib_parser[ia]->compile<7>();
            if constexpr (amrex::RunOnGpu<typename PTile::template AllocatorType<int>>::value) {
            if constexpr (ParticleCreation::particles_on_gpu<typename PTile::template AllocatorType<int>>()) {
                amrex::ParallelFor(stop - start,
                    [=] AMREX_GPU_DEVICE (int i) noexcept {
                        const int ip = i + start;