CHANGELOG.md: 23 additions & 12 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,18 +7,23 @@ This file is used to list changes made in each version of the AWS ParallelCluste
 ------
 
 **ENHANCEMENTS**
-- Add support for P6e-GB200 instances. ParallelCluster sets up Slurm topology plugin to handle P6e-GB200 UltraServers. See limitations section for important additional setup requirements.
-- Add support for P6-B200 instances for all OSs except AL2.
+- Include drivers for P6e-GB200 and P6-B200 instances. ParallelCluster sets up Slurm topology plugin to handle P6e-GB200 UltraServers. See limitations section for important additional setup requirements.
+- Support `prioritized` and `capacity-optimized-prioritized` Allocation Strategy. This allows users to prioritize subnets for instance placement to optimize costs and performance.
 - Add `build-image` support for Amazon Linux 2023 AMIs based on kernel 6.12 (in addition to 6.1).
+- Support DCV on Amazon Linux 2023.
+- Echo chef-client logs in the instance console when a node fails to bootstrap. This helps with investigating bootstrap failures in cases CloudWatch logs are not available.
 
 **LIMITATIONS**
 - P6e-GB200 instances are only tested on Amazon Linux 2023, Ubuntu 22.04 and Ubuntu 24.04.
-- Using IMEX on P6e-GB200 requires additional setup. Please refer to <PLACE_HOLDER for the tutorial link>.
+- Using IMEX on P6e-GB200 requires additional setup. Please refer to the dedicated tutorial in our public documentation.
+- P6-B200 instances are only tested on Amazon Linux 2023, RHEL9, Ubuntu 22.04 and Ubuntu 24.04.
 
 **CHANGES**
-- Install nvidia-imex for all OSs except AL2.
-- Remove `berkshelf`. All cookbooks are local and do not need `berkshelf` dependency management.
+- Install nvidia-imex for all OSs except Amazon Linux 2.
 - Remove `UnkillableStepTimeout` from slurm.conf and let slurm set this value.
+- Upgrade Python runtime used by Lambda functions to Python 3.12 (from 3.9). See Lambda Documentation for important information about Python 3.9 EOL: https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html
+- Support encryption of EFS file system used for the head node internal shared storage via a new configuration parameter `HeadNode/SharedStorageEfsSettings/Encrypted`
+- Add validator that warns against using non GPU instances with DCV.
 - Upgrade Slurm to version 24.11.6 (from 24.05.8).
 - Upgrade EFA installer to 1.43.2 (from 1.41.0).
 - Efa-driver: efa-2.17.2-1
@@ -28,20 +33,26 @@ This file is used to list changes made in each version of the AWS ParallelCluste
 - Rdma-core: rdma-core-58.0-1
 - Open MPI: openmpi40-aws-4.1.7-2 and openmpi50-aws-5.0.6-11
 - Upgrade Cinc Client to version 18.4.12 (from 18.2.7).
-- Upgrade NVIDIA driver to version 570.172.08 (from 570.86.15) for all OSs except AL2.
-- Upgrade CUDA Toolkit to version 12.8.1 (from 12.8.0) for all OSs except AL2.
-- Upgrade DCGM to version 4.4.1 (from 3.3.6) for all OSs except AL2.
-- Upgrade Python to 3.12.11 (from 3.12.8) for all OSs except AL2.
-- Upgrade Python to 3.9.23 (from 3.9.20) for AL2.
+- Upgrade NVIDIA driver to version 570.172.08 (from 570.86.15) for all OSs except Amazon Linux 2.
+- Upgrade CUDA Toolkit to version 12.8.1 (from 12.8.0) for all OSs except Amazon Linux 2.
+- Upgrade DCGM to version 4.4.1 (from 3.3.6) for all OSs except Amazon Linux 2.
+- Upgrade Python to 3.12.11 (from 3.12.8) for all OSs except Amazon Linux 2.
+- Upgrade Python to 3.9.23 (from 3.9.20) for Amazon Linux 2.
 - Upgrade Intel MPI Library to 2021.16.0 (from 2021.13.1).
 - Upgrade DCV to version 2024.0-19030.
 - Upgrade the official ParallelCluster Amazon Linux 2023 AMIs to kernel 6.12 (from 6.1).
 
 **BUG FIXES**
-- Fix a race condition in CloudWatch Agent startup that could cause nodes bootstrap failures.
-- Fix cluster id mismatch issue by deleting the file `/var/spool/slurm.state/clustername` before configuring Slurm accounting.
+- Prevent `build-image` stack deletion failures by deploying a global role that automatically deletes the `build-image` stack after images either succeed or fail the build.
+The role is meant to exist even after the stack has been deleted. See https://github.com/aws/aws-parallelcluster/issues/5914.
+- Fix an issue where Security Group validation failed when a rule contained both IPv4 ranges (IpRanges) and security group references (UserIdGroupPairs).
+- Fix `build-image` failure on Rocky 9, occurring when the parent image does not ship the latest kernel version on the latest Rocky minor version.
+- Fix cluster id mismatch issue which causes cluster update failures when slurm accounting is used.
+- Fix a race condition in CloudWatch Agent startup that could cause node bootstrap failures.
 
 **DEPRECATIONS**
+- The configuration parameter `LoginNodes/Pools/Ssh/KeyName` has been deprecated, and it will be removed in future releases. The CLI now returns a warning message when it is used in the cluster configuration.
+See https://github.com/aws/aws-parallelcluster/issues/6811.
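
Reviewer note: two of the entries in this diff concern configuration paths (`HeadNode/SharedStorageEfsSettings/Encrypted` and the deprecated `LoginNodes/Pools/Ssh/KeyName`). The following is a minimal sketch, not the ParallelCluster CLI's own validator, of how a user could scan an already-parsed cluster configuration for both. The helper names are hypothetical, and the sketch assumes that `LoginNodes/Pools` is a list of mappings with an optional `Name` key and that `Encrypted` is a boolean.

```python
from typing import Any

# Paths taken from the changelog entries in this diff.
DEPRECATED_PATH = "LoginNodes/Pools/Ssh/KeyName"


def deprecation_warnings(config: dict[str, Any]) -> list[str]:
    """Return one warning per login node pool that still sets Ssh/KeyName.

    Hypothetical helper: assumes LoginNodes/Pools is a list of mappings.
    """
    warnings = []
    pools = (config.get("LoginNodes") or {}).get("Pools") or []
    for index, pool in enumerate(pools):
        if "KeyName" in (pool.get("Ssh") or {}):
            label = pool.get("Name", f"index {index}")
            warnings.append(f"{DEPRECATED_PATH} is deprecated (pool: {label})")
    return warnings


def efs_encryption_requested(config: dict[str, Any]) -> bool:
    """Report whether HeadNode/SharedStorageEfsSettings/Encrypted is set to true.

    Assumption: a missing key is treated as "not requested" here; the real
    default is not stated in the changelog.
    """
    settings = (config.get("HeadNode") or {}).get("SharedStorageEfsSettings") or {}
    return settings.get("Encrypted") is True


# Example configuration as a plain dict (for instance, the result of loading the YAML file).
example_config = {
    "HeadNode": {"SharedStorageEfsSettings": {"Encrypted": True}},
    "LoginNodes": {"Pools": [{"Name": "login", "Ssh": {"KeyName": "my-key"}}]},
}

for message in deprecation_warnings(example_config):
    print("WARNING:", message)
print("EFS encryption requested:", efs_encryption_requested(example_config))
```

The sketch works on plain dictionaries so it does not depend on any particular YAML library; the actual warning text emitted by the CLI may differ.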
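
Reviewer note: the Security Group bug fix in this diff mentions rules that contain both `IpRanges` and `UserIdGroupPairs`. As a rough illustration of why both lists need to be read from the same rule, here is a sketch that collects every source from one `IpPermissions` entry in the shape returned by the EC2 `DescribeSecurityGroups` API. The function name and sample values are made up for illustration, and this is not the validator code changed by this release.

```python
def rule_sources(rule: dict) -> list[str]:
    """Collect IPv4 CIDRs and referenced security group IDs from one rule.

    Both lists can be populated on the same rule, so neither is skipped
    when the other is present.
    """
    cidrs = [entry["CidrIp"] for entry in rule.get("IpRanges", [])]
    groups = [pair["GroupId"] for pair in rule.get("UserIdGroupPairs", [])]
    return cidrs + groups


# Illustrative rule mixing a CIDR range with a security group reference.
mixed_rule = {
    "IpProtocol": "tcp",
    "FromPort": 22,
    "ToPort": 22,
    "IpRanges": [{"CidrIp": "10.0.0.0/16"}],
    "UserIdGroupPairs": [{"GroupId": "sg-0123456789abcdef0"}],
}

print(rule_sources(mixed_rule))
```

A check that returns early after finding `IpRanges` would miss the group reference in `mixed_rule`, which is the kind of case the changelog entry describes.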