@Allenz5 Allenz5 commented Jun 9, 2025

Description of changes

  • Extend AllocationStrategy to include capacity-optimized-prioritized and prioritized options
  • Extend instances_validators.py::InstancesAllocationStrategyValidator to verify that the new AllocationStrategy works correctly with the specified CapacityType
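As a rough illustration of the first bullet, the extended enum might look like the sketch below. Only `LOWEST_PRICE` and `PRIORITIZED` appear as member names in the diff under review; the other member names, the two values marked as pre-existing, and the `parse_allocation_strategy` helper are assumptions made for this example.

```python
from enum import Enum


class AllocationStrategy(Enum):
    """Sketch of the extended enum; see the comments for which parts are assumed."""

    LOWEST_PRICE = "lowest-price"
    CAPACITY_OPTIMIZED = "capacity-optimized"  # assumed pre-existing value
    PRICE_CAPACITY_OPTIMIZED = "price-capacity-optimized"  # assumed pre-existing value
    PRIORITIZED = "prioritized"  # option added by this PR
    CAPACITY_OPTIMIZED_PRIORITIZED = "capacity-optimized-prioritized"  # option added by this PR; member name assumed


def parse_allocation_strategy(raw: str) -> AllocationStrategy:
    """Hypothetical helper: map a configuration string to its enum member.

    Enum lookup by value raises ValueError for strings that are not defined.
    """
    return AllocationStrategy(raw)
```

Looking a value up through the enum means an unsupported string fails early, before any capacity-type check runs.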

Tests

  • Extend test_instances_validators.py::test_instances_allocation_strategy_validator to include new test cases
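The kind of case table such a test could cover is sketched below. The `has_failure` function is a hypothetical stand-in for the real validator, the capacity type strings and the SPOT strategy set are assumptions, and a plain loop is used so the sketch has no test-framework dependency.

```python
# Hypothetical stand-in for the validator under test. The ONDEMAND set follows
# the updated docstring in this PR; the SPOT set is an assumption.
ALLOWED = {
    "ONDEMAND": {"lowest-price", "prioritized"},
    "SPOT": {
        "lowest-price",
        "capacity-optimized",
        "price-capacity-optimized",
        "capacity-optimized-prioritized",
    },
}


def has_failure(capacity_type: str, allocation_strategy: str) -> bool:
    """Return True when the strategy is not allowed for the capacity type."""
    return allocation_strategy not in ALLOWED[capacity_type]


# (capacity type, allocation strategy, whether a failure is expected)
CASES = [
    ("ONDEMAND", "prioritized", False),
    ("ONDEMAND", "capacity-optimized-prioritized", True),
    ("SPOT", "capacity-optimized-prioritized", False),
    ("SPOT", "prioritized", True),
]


def run_cases() -> list:
    """Return the cases whose outcome differs from the expectation."""
    return [
        (capacity_type, strategy)
        for capacity_type, strategy, expected in CASES
        if has_failure(capacity_type, strategy) != expected
    ]
```

Each new strategy appears once as an accepted case and once as a rejected case, so both directions of the check are exercised.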

References

Checklist

  • Make sure you are pointing to the right branch.
  • If you're creating a patch for a branch other than develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
  • Check all commits' messages are clear, describing what and why vs how.
  • Make sure to have added unit tests or integration tests to cover the new/modified code.
  • Check if documentation is impacted by this change.

Please review the guidelines for contributing and Pull Request Instructions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Allenz5 and others added 13 commits June 3, 2025 14:56
…d to AllocationStrategy configuration; Add new configuration of enable_single_availability_zone;

Signed-off-by: Hanxuan Zhang <[email protected]>
…nfig.cluster_config to pcluster.config.common
…ster_config.py to common.py"

This reverts commit 13d8cfa.
…dator in cluster_config.py and add registration tests
@Allenz5 Allenz5 requested review from a team as code owners June 9, 2025 20:37
@himani2411

I would suggest adding a title to the PR too.

@Allenz5 Allenz5 changed the title from "[Subnet Prioritization]" to "[Subnet Prioritization] Support capacity-optimized-prioritized and prioritized Allocation Strategy" on Jun 11, 2025
@himani2411

Please update the PR description so that it stays consistent with what has been removed from the PR.

@himani2411 himani2411 merged commit f2883fe into aws:develop Jul 9, 2025
24 checks passed
@himani2411

@Allenz5 in a follow-up PR, please add the integration tests to our daily integration test set (develop.yaml).

hgreebe pushed a commit to hgreebe/aws-parallelcluster that referenced this pull request Jul 11, 2025
…ioritized Allocation Strategy (aws#6865)

* [Subnet Prioritization] Add prioritized|capacity-optimized-prioritized to AllocationStrategy configuration; Add new configuration of enable_single_availability_zone;

Signed-off-by: Hanxuan Zhang <[email protected]>

* [Subnet Prioritization] Add test cases for instance allocation strategy validator

* [Subnet Prioritization] Update the default value and update policy of enable_single_availability_zone

* [Subnet Prioritization] Move AllocationStrategy Enum from pcluster.config.cluster_config to pcluster.config.common

* [Subnet Prioritization] Add validator and validator test for enable_single_availability_zone

* [Subnet Prioritization] Move AllocationStrategy Enum from cluster_config.py to common.py

* Revert "[Subnet Prioritization] Move AllocationStrategy Enum from cluster_config.py to common.py"

This reverts commit 13d8cfa.

* [Subnet Prioritization] Register enable_single_availability_zone_validator in cluster_config.py and add registration tests

* [Subnet Prioritization] Change default value of enable_single_availability_zone from False to None

* [Subnet Prioritization] Add enable_single_availability_zone parameter to slurm.full_config.snapshot.yaml

* [Subnet Prioritization] Fix format issues

* [Subnet Prioritization] Update CHANGELOG.md

* [Subnet Prioritization] Remove duplicated AllocationStrategy Enum

* [Subnet Prioritization] Remove duplicated EnableSingleAvailabilityZoneValidator registration

* [Subnet Prioritization] Update the failure message of InstancesAllocationStrategyValidator

* [Subnet Prioritization] Update enable_single_availability_zone_validator registration test

* [Subnet Prioritization] Update format

* [Subnet Prioritization] Fix EnableSingleAvailabilityZoneValidator

* [Subnet Prioritization] Fix format issue

* [Subnet Prioritization] Add integration test for subnet prioritization

* [Subnet Prioritization] Update integration test for subnet prioritization

* [Subnet Prioritization] Remove EnableSingleAvailabilityZone parameter from configuration

* [Subnet Prioritization] Update Integration Test

* [Subnet Prioritization] Fix format issues

* [Subnet Prioritization] Remove EnableSingleAvailabilityZone from integration test

* [Subnet Prioritization] Move AllocationStrategy Enum from common.py back to cluster_config.py

* [Subnet Prioritization] Update integration test and format

* [Subnet Prioritization] Remove custom hardcode settings from integration test config file

* [Subnet Prioritization] Update format in instances_validators.py

---------

Signed-off-by: Hanxuan Zhang <[email protected]>

**CHANGES**
- Ubuntu 20.04 is no longer supported.
- Support prioritized and capacity-optimized-prioritized Allocation Strategy

  def _validate(self, compute_resource_name: str, capacity_type: Enum, allocation_strategy: Enum, **kwargs):
-     """On-demand Capacity type only supports "lowest-price" allocation strategy."""
+     """On-demand Capacity type only supports "lowest-price" and "prioritized" allocation strategy."""
      valid_on_demand_allocation_strategy = {

[CodeStyle] This is a constant dictionary, so let's define it in a common script, ideally where the constants about strategies are defined. There is no reason to redefine it every time this function gets called.
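A minimal sketch of that suggestion, assuming a simplified `AllocationStrategy` enum: the constant name `VALID_ON_DEMAND_ALLOCATION_STRATEGIES` and the helper function are hypothetical and only illustrate moving the collection to module scope.

```python
from enum import Enum


class AllocationStrategy(Enum):
    """Simplified stand-in for cluster_config.AllocationStrategy."""

    LOWEST_PRICE = "lowest-price"
    PRIORITIZED = "prioritized"
    CAPACITY_OPTIMIZED_PRIORITIZED = "capacity-optimized-prioritized"  # member name assumed


# Built once at import time instead of on every validator call.
# frozenset makes the constant immutable and gives fast membership tests.
VALID_ON_DEMAND_ALLOCATION_STRATEGIES = frozenset(
    {AllocationStrategy.LOWEST_PRICE, AllocationStrategy.PRIORITIZED}
)


def is_valid_on_demand_strategy(allocation_strategy: AllocationStrategy) -> bool:
    """Hypothetical helper: check an on-demand strategy against the shared constant."""
    return allocation_strategy in VALID_ON_DEMAND_ALLOCATION_STRATEGIES
```

Because the constant lives at module level, other validators and tests can import it rather than repeating the list of values.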

)
if (
    capacity_type == cluster_config.CapacityType.SPOT
    and allocation_strategy == cluster_config.AllocationStrategy.PRIORITIZED

We could take the chance here to generalize the check the same way we did for on-demand, i.e., define the valid strategies for spot in a constant and check against that constant.
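One way to generalize the check is sketched here, assuming simplified enums. The mapping `VALID_ALLOCATION_STRATEGIES`, the function `find_violation`, the `ONDEMAND` value, and the SPOT strategy set are all assumptions for illustration, not the code that was merged.

```python
from enum import Enum
from typing import FrozenSet, Optional


class CapacityType(Enum):
    ONDEMAND = "ONDEMAND"  # assumed name and value
    SPOT = "SPOT"


class AllocationStrategy(Enum):
    LOWEST_PRICE = "lowest-price"
    CAPACITY_OPTIMIZED = "capacity-optimized"  # assumed
    PRICE_CAPACITY_OPTIMIZED = "price-capacity-optimized"  # assumed
    PRIORITIZED = "prioritized"
    CAPACITY_OPTIMIZED_PRIORITIZED = "capacity-optimized-prioritized"  # member name assumed


# One constant per capacity type, so both checks share the same code path.
VALID_ALLOCATION_STRATEGIES = {
    CapacityType.ONDEMAND: frozenset(
        {AllocationStrategy.LOWEST_PRICE, AllocationStrategy.PRIORITIZED}
    ),
    CapacityType.SPOT: frozenset(
        {
            AllocationStrategy.LOWEST_PRICE,
            AllocationStrategy.CAPACITY_OPTIMIZED,
            AllocationStrategy.PRICE_CAPACITY_OPTIMIZED,
            AllocationStrategy.CAPACITY_OPTIMIZED_PRIORITIZED,
        }
    ),
}


def find_violation(
    capacity_type: CapacityType, allocation_strategy: AllocationStrategy
) -> Optional[FrozenSet[AllocationStrategy]]:
    """Return the allowed strategies when the combination is invalid, else None."""
    allowed = VALID_ALLOCATION_STRATEGIES.get(capacity_type)
    if allowed is None or allocation_strategy in allowed:
        return None
    return allowed
```

With this shape, supporting another capacity type or strategy means editing the mapping rather than adding another `if` branch.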

self._add_failure(
    f"Compute Resource {compute_resource_name} is using a SPOT CapacityType but the "
    f"Allocation Strategy specified is {allocation_strategy.value}. SPOT CapacityType can only use "
    f"'{cluster_config.AllocationStrategy.LOWEST_PRICE.value}', "

Better to define the valid strategies for spot in a constant and concatenate them into the message.
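A sketch of how the message could be assembled from such a constant follows. The tuple `VALID_SPOT_ALLOCATION_STRATEGIES`, its contents, and the function `spot_failure_message` are assumptions; the sentence wording follows the f-string in the diff above.

```python
from enum import Enum


class AllocationStrategy(Enum):
    LOWEST_PRICE = "lowest-price"
    CAPACITY_OPTIMIZED = "capacity-optimized"  # assumed
    PRICE_CAPACITY_OPTIMIZED = "price-capacity-optimized"  # assumed
    PRIORITIZED = "prioritized"
    CAPACITY_OPTIMIZED_PRIORITIZED = "capacity-optimized-prioritized"  # member name assumed


# A tuple keeps a stable order, so the rendered message is deterministic.
VALID_SPOT_ALLOCATION_STRATEGIES = (
    AllocationStrategy.LOWEST_PRICE,
    AllocationStrategy.CAPACITY_OPTIMIZED,
    AllocationStrategy.PRICE_CAPACITY_OPTIMIZED,
    AllocationStrategy.CAPACITY_OPTIMIZED_PRIORITIZED,
)


def spot_failure_message(
    compute_resource_name: str, allocation_strategy: AllocationStrategy
) -> str:
    """Build the failure text, listing allowed values from the constant."""
    allowed = ", ".join(
        f"'{strategy.value}'" for strategy in VALID_SPOT_ALLOCATION_STRATEGIES
    )
    return (
        f"Compute Resource {compute_resource_name} is using a SPOT CapacityType but the "
        f"Allocation Strategy specified is {allocation_strategy.value}. "
        f"SPOT CapacityType can only use {allowed}."
    )
```

Deriving the list from the constant keeps the message and the check in sync when a strategy is added or removed.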

    - Name: compute_resource2
      InstanceType: c4.2xlarge
- Name: queue2
  AllocationStrategy: "prioritized"

nit: double quotes are not required in yaml
