CropSuite does not respect allocated CPUs by SLURM #5

@LenLon

Description

I start CropSuite using SLURM with the following SBATCH arguments:

#!/bin/bash
#SBATCH --job-name=crop_suite
#SBATCH --qos=short
#SBATCH --partition=standard
#SBATCH --cpus-per-task=24
#SBATCH --account=agrica
#SBATCH --output=slurm/%x_%j.log      
#SBATCH --error=slurm/%x_%j.log    

While squeue shows the job with the 24 CPUs correctly allocated per the #SBATCH --cpus-per-task=24 argument, CropSuite's internal call to multiprocessing.cpu_count() seems to return the total number of CPUs on the node (128) instead, as evidenced by these lines in the SLURM job log:

-> Downscaling RRPCF data for alfalfa under rf conditions
 -> Downscaling RRPCF data for avocado under rf conditions
 -> Downscaling RRPCF data for banana under rf conditions
 -> Downscaling RRPCF data for beans under rf conditions
Using 127 workers
-> Data loaded
Limiting to 127 cores
 -> Processing day #0                                      
 -> Processing day #2                                      
 -> Processing day #1                                      
 -> Processing day #4        

This gridlocks the whole HPC node :-(
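For what it's worth, the mismatch is easy to check: multiprocessing.cpu_count() reports all CPUs on the node, while the CPU affinity mask reflects what the process may actually use. A minimal sketch (Linux-only, since os.sched_getaffinity() is not available on all platforms; whether it shows the SLURM allocation depends on the cluster's CPU-binding/cgroup configuration):

```python
import multiprocessing
import os

# What CropSuite apparently asks for: every logical CPU on the node.
node_cpus = multiprocessing.cpu_count()

# What the scheduler actually lets this process run on. Under SLURM with
# CPU binding enabled, this should reflect --cpus-per-task (e.g. 24), not 128.
# os.sched_getaffinity() is Linux-only, hence the fallback.
if hasattr(os, "sched_getaffinity"):
    allowed_cpus = len(os.sched_getaffinity(0))
else:
    allowed_cpus = node_cpus

print(f"multiprocessing.cpu_count() -> {node_cpus}, allowed -> {allowed_cpus}")
```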
Our IT department requires that the environment variable PYTHON_CPU_COUNT be set in the Python code. It limits the CPUs used when combined with os.process_cpu_count(), and from Python 3.13 onwards it is also respected by multiprocessing.cpu_count().

Given that I have no idea about Python: where should I set this environment variable? Just in CropSuite.py?
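My understanding (which may well be wrong) is that CPython reads PYTHON_CPU_COUNT at interpreter startup, so setting it inside CropSuite.py once Python is already running may be too late for the current process; exporting it in the batch script before invoking Python (export PYTHON_CPU_COUNT=$SLURM_CPUS_PER_TASK) seems like the more reliable route. A small sketch demonstrating the mechanism with a fresh child interpreter (the override itself requires Python 3.13+):

```python
import os
import subprocess
import sys

# Launch a fresh interpreter with PYTHON_CPU_COUNT set in its environment,
# mimicking `export PYTHON_CPU_COUNT=$SLURM_CPUS_PER_TASK` in the batch
# script. "24" stands in for the SLURM allocation.
env = dict(os.environ, PYTHON_CPU_COUNT="24")
reported = subprocess.run(
    [sys.executable, "-c", "import os; print(os.cpu_count())"],
    env=env, capture_output=True, text=True, check=True,
).stdout.strip()

# On Python 3.13+ the child reports 24 regardless of the node's real CPU count.
print(f"child interpreter sees {reported} CPUs")
```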
