
Conversation

@garth-wells (Member) commented Aug 28, 2025

This PR adds experimental 'single source' CI testing via Spack. All other tests are unchanged. The proposal is to merge this PR and monitor how it goes; while experimental, it will not be a merge requirement.

Background

Issues this PR addresses are:

  • Our test framework is not 'single source'; it is a mishmash of Docker and GitHub Actions.
  • Our Docker files are complicated and bloated.
  • Control over the versioning of Python packages in the test system is weak; some packages are (built and) installed in the Dockerfile, others are installed in the CI framework. This has caused some recent testing bugs, in particular around maintaining an older version of NumPy that is compatible with Numba.
  • We can't practicably test against new/different versions of dependencies and compilers. Doing so requires changes to already complicated Docker files and Docker builds, and the simplest approach clobbers the previous Docker image tag, which can cause CI failures.

Spack-based CI

This PR addresses the above issues:

  • The test framework is single source: the Actions YAML file controls both the build and the test execution.
  • Versions of all dependencies can be controlled in a single YAML file (see the sketch after this list).
  • Different dependencies can be tested in a branch, with no side-effects.
  • Spack packages are cached using the GitHub container registry (a private cache at the moment, rather than a public one; a caching sketch follows below).
  • The test suite is somewhat slower than the present setup, since dependencies are pulled from the cache one-by-one by Spack. There is scope to reduce the runtime.
  • Spack's built-in system for running CI tests has several issues (some are on the Spack issue tracker); it is not suitable for running our tests at present.
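
As a minimal sketch of the single dependency-controlling YAML file mentioned above (the specs and versions are illustrative, not the PR's actual file):

spack:
  specs:
    - py-fenics-dolfinx@main
    - [email protected]          # pin a significant dependency
    - py-numpy@:1.26         # e.g. keep NumPy compatible with Numba
  concretizer:
    unify: true              # one consistent version of each package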

This PR exposed a number of shortcomings/bugs in the FEniCS Spack specs, and bugs in dependency Spack specs. The PR therefore uses the specs from a branch of spack-packages; the changes will eventually be upstreamed.
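
For reference, a sketch of the build-cache wiring against the GitHub container registry (the mirror name and ghcr.io path are illustrative, and the OCI authentication flag spellings vary between Spack versions):

spack mirror add --oci-username "$GITHUB_ACTOR" --oci-password "$GITHUB_TOKEN" \
    fenics-cache oci://ghcr.io/FEniCS/spack-buildcache
spack -e py install                                        # pull cached binaries where available
spack -e py buildcache push --update-index fenics-cache    # push newly built packages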

Issues and questions

  • This PR uses minimal dependency pinning. Should we pin significant dependencies, e.g. PETSc, and let others float?

  • If Spack cannot find a suitable cached dependency, it will rebuild and cache it. Rebuilding some dependencies, e.g. LLVM (for Numba) and VTK (for pyvista), is very heavy. 'Controlled' rebuilds can be managed; they just require patience, or running locally on a powerful system and pushing to the cache. Unexpected heavy rebuilds are what would be problematic, since CI would suddenly become very slow. Version pinning of heavy dependencies may avoid this (a pinning sketch follows after this list).

    If we see too frequent re-builds, we can figure out what to do.

  • In this PR, all Python test dependencies are installed using Spack.

    Advantages:

    • Tight control over compatible dependency versions.
    • Self-contained, i.e. not dependent on host system libraries.

    Disadvantages:

    • Some Python packages have very deep and heavy dependency graphs, e.g. numba (LLVM dep), (py-)gmsh (lots of graphics library deps) and pyvista (VTK dep).
    • Quite a few Python packages in Spack are not actively/well maintained. This PR exposed bugs in some Spack Python specs.

    A question for later: should Python test dependencies be installed via Spack or pip? pip would be faster and require less package maintenance, but would require greater care with Python dependency versions (a pip sketch follows after this list).

    Spack itself is presently used at its development version; we may want to pin this (a checkout sketch follows after this list). There are a few details still to add on specifying FEniCS dependency branches.
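
On pinning heavy dependencies (the rebuild question above), a sketch using Spack's packages requirements, with illustrative versions:

spack:
  packages:
    llvm:
      require: "@17"     # illustrative pin for the Numba toolchain
    vtk:
      require: "@9.3"    # illustrative pin for the pyvista stack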
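
On the Spack-or-pip question, a sketch of what the pip route could look like (the file paths are hypothetical), with a constraints file carrying the version pins:

- name: Install Python test dependencies with pip
  run: |
    python -m pip install -r python/test/requirements.txt \
      -c .github/workflows/constraints.txt    # e.g. pins numpy<2 for Numba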
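
Pinning Spack itself would be a one-line change in the checkout step; a sketch with an illustrative release tag:

- name: Checkout Spack
  uses: actions/checkout@v4
  with:
    repository: spack/spack
    ref: v0.23.0       # a release tag instead of develop
    path: spack-src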

@jorgensd (Member) commented:

With respect to our Docker hierarchy, I think we should keep a Dockerfile.test-env and a Dockerfile.end-user with a simplified setup. For instance, only support:

  • PETSc 32-bit, in real and complex mode

    • compatible SLEPc versions

  • pyvista (which I think can be installed via pip now that VTK has ARM images)

  • GMSH (some apt packages are required for it to work)

  • ADIOS2 (I think we still need to build this from source to get a working Python interface)

and use the apt MPI compilers etc.

I'm in favor of this, as the current Dockerfiles make it super easy for end users/third-party vendors to set up CI with minor additional dependencies, which would be incredibly slow to install with Spack and a re-concretization.

> What I didn't mention, because it's adjacent rather than directly related, is that Spack can produce Docker images (very easy, see https://spack.readthedocs.io/en/latest/containers.html#from-existing-installations) or Dockerfiles (https://spack.readthedocs.io/en/latest/containers.html#generating-recipes-for-docker-and-singularity), or build Docker images from an environment spec. This would be a lot cleaner than what we do now.
>
> Spack CI builds are not slow when used with a build cache. Something to consider for the future is whether we want to provide a public Spack mirror/cache (which are the same thing in Spack) for FEniCS.

I am aware of this feature, and it looks very neat. The problem is if a user has a dependency that isn't in Spack (or one installed through pip, or they need to compile C++ code against DOLFINx, as in the case of DOLFINx_MPC), which we then need to add to the Spack spec (i.e. we need to install py-pip in the runtime images).
If we do not add py-pip, cmake, nanobind and all of our build deps to those images, a user has to add them, re-concretize and possibly go through the whole installation phase of DOLFINx, as there might be incompatibilities with our build cache.
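
For reference, the container generation mentioned in the quote above is driven by a container section in the environment file; a minimal sketch (the base image and specs are illustrative):

spack:
  specs:
    - py-fenics-dolfinx@main
  container:
    format: docker
    images:
      os: ubuntu:24.04
      spack: develop

Running spack containerize in the directory holding that spack.yaml then prints a Dockerfile.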

@garth-wells (Member, Author) commented:

> I am aware of this feature, and it looks very neat. The problem is if a user has a dependency that isn't in Spack (or one installed through pip, or they need to compile C++ code against DOLFINx, as in the case of DOLFINx_MPC), which we then need to add to the Spack spec …

That's not correct: Spack can install just the build dependencies, just as we do manually now in the Dockerfiles. Users can then install whatever else they like by hand.
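
A sketch of that mode of use, with the spec name taken from this PR's packages:

# Install only the dependencies of DOLFINx; the package itself can then
# be built by hand or by a downstream project.
spack install --only dependencies py-fenics-dolfinx@main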

@jorgensd (Member) commented:

> That's not correct: Spack can install just the build dependencies, just as we do manually now in the Dockerfiles. Users can then install whatever else they like by hand.

You need to re-concretize to ensure that the same nanobind is used for both DOLFINx and an extension, and this can cause a reinstall of heavy dependencies. I've experienced this when, for instance, trying to use DOLFINx_MPC with a Spack-installed DOLFINx.
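
One Spack setting that can soften this (a sketch; recent Spack versions may already default to it) is asking the concretizer to reuse installed packages:

spack:
  concretizer:
    reuse: true    # prefer already-installed packages when re-concretizing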

@jhale (Member) commented Aug 28, 2025

Let's not get sidetracked on Docker image design for now - none of us have even tried the Spack Docker generation yet.

run: |
  . $GITHUB_WORKSPACE/spack-src/share/spack/setup-env.sh
  spack env activate py
  spack load gcc
A Member commented:

Is this to ensure jit compilation works?

@garth-wells (Member, Author) commented:

> Is this to ensure jit compilation works?

Yes.

@jorgensd (Member) commented:

Looks OK as an extra way of testing DOLFINx. The changes in spack-packages by Garth are useful.

The only question I have: why does one have to load gcc in the Python env? Should it be a run-time dependency, or is there a specific test that needs it?

spack info py-fenics-dolfinx
spack env create py dolfinx-src/.github/workflows/spack-config/gh-actions-env-test.yml
spack -e py develop --path $GITHUB_WORKSPACE/dolfinx-src fenics-dolfinx@main
spack -e py develop [email protected]=main
A Member commented:

These should be checked out using the GitHub action at the git ref specified in the yml file.

@garth-wells (Member, Author) commented Aug 28, 2025

That would be more code for no advantage.

What remains is to get the branch from the refs env file, i.e.

- name: Load environment variables
  run: cat .github/workflows/fenicsx-refs.env >> $GITHUB_ENV

- name: Set up env
  run: |
    ....
    spack -e py develop fenics-basix@git.${{ env.basix_ref }}=main

The only one we should check out using actions is DOLFINx: the commit hash differs between 'pull request' runs and 'push' runs (GitHub creates internal merge commits for PR triggers), and getting the correct hash/ref isn't trivial.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an advantage: the GitHub action does very shallow clones and handles authentication even into private forks.
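
A sketch of that checkout-action variant (the repository and path names are illustrative; the ref comes from the fenicsx-refs.env file loaded earlier):

- name: Checkout Basix
  uses: actions/checkout@v4
  with:
    repository: FEniCS/basix
    ref: ${{ env.basix_ref }}
    path: basix-src

- name: Register the source with Spack
  run: spack -e py develop --path $GITHUB_WORKSPACE/basix-src fenics-basix@main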

spack info fenics-dolfinx
spack env create cxx dolfinx-src/.github/workflows/spack-config/gh-actions-env-test.yml
spack -e cxx develop --path $GITHUB_WORKSPACE/dolfinx-src fenics-dolfinx@main
spack -e cxx develop [email protected]=main
A Member commented:

Same comment as below on checkout action and yml git ref.

@garth-wells (Member, Author) commented:

> Looks OK as an extra way of testing DOLFINx. The changes in spack-packages by Garth are useful.
>
> The only question I have: why does one have to load gcc in the Python env? Should it be a run-time dependency, or is there a specific test that needs it?

For JIT. I think it's an issue with how py-cffi detects compilers. It's not something I've dug into, and it only pops up in strongly isolated builds (i.e. in minimal containers); there have been too many other, more significant moving parts to sort out so far. If a compiler should be a run dependency, it's not obvious to me which package it would come from.

It does make technical sense to want to load the compiler: one could use different C compilers for JIT.
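
If the load really is only about compiler discovery, an untested alternative (an assumption, not something done in this PR) is to point the JIT toolchain at a compiler explicitly:

# Hypothetical alternative to `spack load gcc` (untested assumption):
# expose a C compiler to cffi/JIT via the standard CC variable.
. $GITHUB_WORKSPACE/spack-src/share/spack/setup-env.sh
spack env activate py
export CC=$(spack location -i gcc)/bin/gcc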

@garth-wells added this pull request to the merge queue Aug 31, 2025.
Merged via the queue into main with commit d29c9fc on Aug 31, 2025; 18 checks passed.
@garth-wells deleted the spack-ci branch August 31, 2025, 21:55.