[Subnet Prioritization] Support capacity-optimized-prioritized and prioritized Allocation Strategy #6865
…d to AllocationStrategy configuration; Add new configuration of enable_single_availability_zone; Signed-off-by: Hanxuan Zhang <[email protected]>
… enable_single_availability_zone
…nfig.cluster_config to pcluster.config.common
…ingle_availability_zone
…fig.py to common.py
…ster_config.py to common.py" This reverts commit 13d8cfa.
…dator in cluster_config.py and add registration tests
…ility_zone from False to None
… to slurm.full_config.snapshot.yaml
...ster/templates/test_cluster_stack/test_cluster_config_limits/slurm.full_config.snapshot.yaml
I would suggest adding a title to the PR too.
…eValidator registration
…tionStrategyValidator
…tor registration test
Please update the PR description so it stays consistent with what has been removed from the PR.
...working/test_cluster_networking/test_cluster_with_subnet_prioritization/pcluster.config.yaml
...working/test_cluster_networking/test_cluster_with_subnet_prioritization/pcluster.config.yaml
tests/integration-tests/tests/networking/test_cluster_networking.py
...working/test_cluster_networking/test_cluster_with_subnet_prioritization/pcluster.config.yaml
tests/integration-tests/tests/networking/test_cluster_networking.py
…ack to cluster_config.py
…ion test config file
@Allenz5 in a follow-up PR, please add the integration tests to our daily integration test set (develop.yaml).
…ioritized Allocation Strategy (aws#6865)

* [Subnet Prioritization] Add prioritized|capacity-optimized-prioritized to AllocationStrategy configuration; Add new configuration of enable_single_availability_zone; Signed-off-by: Hanxuan Zhang <[email protected]>
* [Subnet Prioritization] Add test cases for instance allocation strategy validator
* [Subnet Prioritization] Update the default value and update policy of enable_single_availability_zone
* [Subnet Prioritization] Move AllocationStrategy Enum from pcluster.config.cluster_config to pcluster.config.common
* [Subnet Prioritization] Add validator and validator test for enable_single_availability_zone
* [Subnet Prioritization] Move AllocationStrategy Enum from cluster_config.py to common.py
* Revert "[Subnet Prioritization] Move AllocationStrategy Enum from cluster_config.py to common.py" This reverts commit 13d8cfa.
* [Subnet Prioritization] Register enable_single_availability_zone_validator in cluster_config.py and add registration tests
* [Subnet Prioritization] Change default value of enable_single_availability_zone from False to None
* [Subnet Prioritization] Add enable_single_availability_zone parameter to slurm.full_config.snapshot.yaml
* [Subnet Prioritization] Fix format issues
* [Subnet Prioritization] Update CHANGELOG.md
* [Subnet Prioritization] Remove duplicated AllocationStrategy Enum
* [Subnet Prioritization] Remove duplicated EnableSingleAvailabilityZoneValidator registration
* [Subnet Prioritization] Update the failure message of InstancesAllocationStrategyValidator
* [Subnet Prioritization] Update enable_single_availability_zone_validator registration test
* [Subnet Prioritization] Update format
* [Subnet Prioritization] Fix EnableSingleAvailabilityZoneValidator
* [Subnet Prioritization] Fix format issue
* [Subnet Prioritization] Add integration test for subnet prioritization
* [Subnet Prioritization] Update integration test for subnet prioritization
* [Subnet Prioritization] Remove EnableSingleAvailabilityZone parameter from configuration
* [Subnet Prioritization] Update Integration Test
* [Subnet Prioritization] Fix format issues
* [Subnet Prioritization] Remove EnableSingleAvailabilityZone from integration test
* [Subnet Prioritization] Move AllocationStrategy Enum from common.py back to cluster_config.py
* [Subnet Prioritization] Update integration test and format
* [Subnet Prioritization] Remove custom hardcode settings from integration test config file
* [Subnet Prioritization] Update format in instances_validators.py

---------

Signed-off-by: Hanxuan Zhang <[email protected]>
**CHANGES**
- Ubuntu 20.04 is no longer supported.
- Support prioritized and capacity-optimized-prioritized Allocation Strategy
     def _validate(self, compute_resource_name: str, capacity_type: Enum, allocation_strategy: Enum, **kwargs):
-        """On-demand Capacity type only supports "lowest-price" allocation strategy."""
+        """On-demand Capacity type only supports "lowest-price" and "prioritized" allocation strategy."""
+        valid_on_demand_allocation_strategy = {
[CodeStyle] This is a constant dictionary, so let's define it in a common script, ideally where the constants about strategies are defined. There is no reason to redefine it every time this function gets called.
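A minimal sketch of what this suggestion could look like: the set of valid strategies is built once at import time, and the validator only performs a membership test. The two enums are simplified stand-ins for `cluster_config.CapacityType` and `cluster_config.AllocationStrategy`; the constant name `VALID_ON_DEMAND_ALLOCATION_STRATEGIES` and the helper `is_valid_on_demand_strategy` are hypothetical and do not come from the PR.

```python
from enum import Enum


class CapacityType(Enum):
    """Simplified stand-in for cluster_config.CapacityType (assumption)."""

    ONDEMAND = "ONDEMAND"
    SPOT = "SPOT"


class AllocationStrategy(Enum):
    """Simplified stand-in for cluster_config.AllocationStrategy (assumption)."""

    LOWEST_PRICE = "lowest-price"
    CAPACITY_OPTIMIZED = "capacity-optimized"
    PRIORITIZED = "prioritized"


# Hypothetical module-level constant: evaluated once at import time,
# not on every call to the validator. The members follow the updated
# docstring ("lowest-price" and "prioritized").
VALID_ON_DEMAND_ALLOCATION_STRATEGIES = frozenset(
    {AllocationStrategy.LOWEST_PRICE, AllocationStrategy.PRIORITIZED}
)


def is_valid_on_demand_strategy(allocation_strategy: AllocationStrategy) -> bool:
    """Return True if the strategy is allowed for the on-demand capacity type."""
    return allocation_strategy in VALID_ON_DEMAND_ALLOCATION_STRATEGIES
```

A `frozenset` is used here because the collection is never mutated and only membership matters; a tuple would work equally well if a stable order is needed for messages.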
    )
if (
    capacity_type == cluster_config.CapacityType.SPOT
    and allocation_strategy == cluster_config.AllocationStrategy.PRIORITIZED
We could take the chance here to generalize the check the same way we did for on-demand, i.e., define the valid strategies for spot in a constant and check against that constant.
self._add_failure(
    f"Compute Resource {compute_resource_name} is using a SPOT CapacityType but the "
    f"Allocation Strategy specified is {allocation_strategy.value}. SPOT CapacityType can only use "
    f"'{cluster_config.AllocationStrategy.LOWEST_PRICE.value}', "
It would be better to define the valid strategies for spot in a constant and join them into the message.
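As a rough illustration of this idea, the sketch below keeps the spot strategies in one constant and derives both the check and the message text from it. The enum is a simplified stand-in for `cluster_config.AllocationStrategy`. The contents of `VALID_SPOT_ALLOCATION_STRATEGIES` are an assumption inferred from the visible check, which rejects only `prioritized` for SPOT. The constant and the function `spot_failure_message` are hypothetical names.

```python
from enum import Enum
from typing import Optional


class AllocationStrategy(Enum):
    """Simplified stand-in for cluster_config.AllocationStrategy (assumption)."""

    LOWEST_PRICE = "lowest-price"
    CAPACITY_OPTIMIZED = "capacity-optimized"
    PRICE_CAPACITY_OPTIMIZED = "price-capacity-optimized"
    PRIORITIZED = "prioritized"
    CAPACITY_OPTIMIZED_PRIORITIZED = "capacity-optimized-prioritized"


# Assumption: every strategy except "prioritized" is valid for SPOT.
# A tuple keeps the order stable so the message text is deterministic.
VALID_SPOT_ALLOCATION_STRATEGIES = (
    AllocationStrategy.LOWEST_PRICE,
    AllocationStrategy.CAPACITY_OPTIMIZED,
    AllocationStrategy.PRICE_CAPACITY_OPTIMIZED,
    AllocationStrategy.CAPACITY_OPTIMIZED_PRIORITIZED,
)


def spot_failure_message(
    compute_resource_name: str, allocation_strategy: AllocationStrategy
) -> Optional[str]:
    """Return a failure message for an invalid SPOT strategy, or None if it is valid."""
    if allocation_strategy in VALID_SPOT_ALLOCATION_STRATEGIES:
        return None
    # The list of allowed values is built from the constant, so adding a
    # strategy to the tuple updates the check and the message together.
    allowed = ", ".join(f"'{s.value}'" for s in VALID_SPOT_ALLOCATION_STRATEGIES)
    return (
        f"Compute Resource {compute_resource_name} is using a SPOT CapacityType but the "
        f"Allocation Strategy specified is {allocation_strategy.value}. "
        f"SPOT CapacityType can only use {allowed}."
    )
```

In the real validator the returned string would presumably be passed to `self._add_failure(...)`; that wiring is left out because its full signature is not visible in this excerpt.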
    - Name: compute_resource2
      InstanceType: c4.2xlarge
- Name: queue2
  AllocationStrategy: "prioritized"
nit: double quotes are not required in yaml
Description of changes
Tests
References
Checklist
develop
add the branch name as prefix in the PR title (e.g. [release-3.6]).
Please review the guidelines for contributing and Pull Request Instructions.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.