
Add OCS and GCS Backend Support to clustermq #341

@ernst-bablick


Hi Michael,

I would like to contribute support for Open Cluster Scheduler (OCS) and Gridware Cluster Scheduler (GCS) to clustermq. Both are successors of Sun Grid Engine (SGE) and Univa Grid Engine (UGE).

Background

While testing the existing SGE backend, I observed that it does not work out of the box with OCS/GCS and requires adjustments.

We currently have a prospect who intends to use clustermq on both OCS and GCS clusters concurrently. For this reason, we require distinct scheduler templates for each system. I have implemented the necessary changes in a forked repository and created product-specific templates for both OCS and GCS.

Before opening a pull request, I would like to clarify a few implementation questions and discuss design considerations.

Questions

1. Array Job Behavior and Worker Cancellation

In SGE-like systems, workloads are typically scheduled as array jobs. This means that workers belonging to a single Q() call may not start simultaneously. As a result, there can be situations where some workers remain queued even though the workload has already completed.

It appears that remaining queued workers are not automatically canceled when the master process finishes.

  • Is there currently a mechanism to cancel remaining queued workers?
  • If not, where would be the most appropriate place in the architecture to implement such cancellation logic?
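For illustration, one low-level approach would be for the master (or a shutdown hook) to record the array job ID at submission time and issue a single qdel once all results have been collected; qdel on an array job ID removes queued tasks along with running ones. A minimal shell sketch of the idea — the qsub output line and the variable names are assumptions for illustration, not existing clustermq behaviour:

```shell
#!/bin/sh
# Hypothetical cleanup sketch (not current clustermq behaviour): capture the
# array job ID at submission time, then cancel the whole array on shutdown.

# Example qsub output line; exact wording varies by Grid Engine flavour.
line='Your job-array 12345.1-10:1 ("worker") has been submitted'

# Extract the numeric job ID from the submission message.
job_id=$(printf '%s\n' "$line" | sed -E 's/.*job(-array)? ([0-9]+).*/\2/')
echo "$job_id"   # -> 12345

# On master shutdown, qdel on the array job ID would remove still-queued
# tasks along with running ones:
# qdel "$job_id"
```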

2. The cores Parameter in SGE and PBS Backends

The SGE and PBS backends use a cores parameter to request multiple cores per worker job (possibly via a parallel environment). I would like clarification on:

  • When and how is this parameter used?
  • Does this relate to MPI-based R workloads?
  • Or is it primarily intended to allocate multiple scheduler slots/cores per clustermq worker process?

From my observation, Q() does not expose a cores argument directly. How and where is this parameter expected to be configured?

As I am relatively new to R, I would also appreciate insight into whether there are common R workloads requiring MPI-style parallelism within a clustermq worker, or whether this parameter primarily controls resource allocation.
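For context, here is how I currently understand the wiring, based on the shipped SGE template: the template file contains `{{ key | default }}` placeholders, and values supplied through `Q()`'s `template` argument are filled in at submission time. A sketch of what I have in mind for the OCS/GCS templates — the `smp` parallel environment name is site-specific, and the exact placeholder usage is my assumption, not a confirmed part of the package:

```
# Hypothetical excerpt of an OCS/GCS template, using clustermq's
# {{ key | default }} placeholder syntax:
#$ -N {{ job_name }}
#$ -t 1-{{ n_jobs }}
#$ -pe smp {{ cores | 1 }}   # request `cores` slots per worker via a PE
```

This would presumably be driven from R with something like `Q(fun, ..., n_jobs = 10, template = list(cores = 4))` — please correct me if `template` is not the intended entry point for this.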

3. CPU and Memory Binding Features (GCS)

GCS provides enterprise binding features that allow binding jobs to specific hardware units such as:

  • Hardware threads
  • CPU cores
  • Dies
  • Sockets
  • CPU caches
  • NUMA nodes

This enables the scheduler to place jobs on hardware resources sharing cache or memory locality, which can significantly improve performance for certain workloads.

Does clustermq currently provide a way to expose or control scheduler-level binding features?

If not, would it be acceptable to extend Q() with an optional parameter that allows users to specify binding requirements for their jobs?

In some scenarios, assigning one CPU core per worker is not optimal: binding to a NUMA node or cache domain can provide better performance characteristics, though I am unsure whether this also applies to typical R workloads.
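If extending the templates rather than `Q()` itself is acceptable, binding could be exposed the same way as other resources: a placeholder in the product-specific template that users override per call. A sketch using Grid Engine's `-binding` submit option — the placeholder name `binding` is my invention, not an existing template value:

```
# Hypothetical GCS template excerpt: expose core binding as a template value.
# "-binding linear:n" binds the job to n consecutive cores; other strategies
# (striding, explicit) could be requested through the same placeholder.
#$ -binding {{ binding | linear:1 }}
```

A user could then request, say, four consecutive cores per worker with something like `Q(..., template = list(binding = "linear:4"))`, without any change to `Q()`'s signature.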

Next Steps

I would be happy to share my fork and proposed changes for review. Before opening a pull request, I would appreciate some guidance.
