added new tool to scale up-down nodes on an instance group #708
base: main
Please move the entire files under
moved file and updated README.md as requested.
maybe the previous file still remains in the original directory
@@ -12,3 +12,4 @@ This template creates a S3 Bucket with all public access disabled. To deploy it,
This template deploys a stack to receive human-readable email notifications for HyperPod cluster status changes and node health events. See the [workshop page](https://catalog.workshops.aws/sagemaker-hyperpod/en-US/07-tips-and-tricks/26-event-bridge) for more details.
Kindly remove this file + directory.
@KeitaW these files were already there and are not part of this PR.
If you want, I can open a new PR for moving the files that were originally there too, as those have not been modified. The reason is that other assets, such as workshops, might have links to those files and moving them will break these assets.
ah okay, my bad.
No previous issue, new feature.
Added a new AWS CloudFormation template to the Architecture/Common folder. It deploys a solution that scales the compute nodes of an instance group up or down on a schedule.
The template deploys Amazon EventBridge rules that trigger an AWS Lambda function to update the node count on an instance group. Each rule is based on a cron expression: one rule scales up and another scales down.
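For reviewers, here is a rough sketch of what the Lambda side of this flow could look like. It is not the code shipped in the template. The event fields `cluster_name`, `instance_group_name`, and `target_count` are hypothetical names for whatever each EventBridge rule passes as input. The sketch also assumes that `update_cluster` should be sent the configuration of every instance group, with only the target group's count changed, and that the four fields carried over from `describe_cluster` are enough for the request.

```python
from typing import Any

# Fields returned by describe_cluster for each instance group that
# update_cluster also accepts. Assumed sufficient for this sketch; a real
# cluster may need more (for example storage or threads-per-core settings).
_CARRY_OVER = ("InstanceGroupName", "InstanceType", "LifeCycleConfig", "ExecutionRole")


def build_instance_groups(
    described: list[dict[str, Any]], group_name: str, target_count: int
) -> list[dict[str, Any]]:
    """Build the InstanceGroups list for update_cluster.

    Every group keeps its current target count except `group_name`,
    which is set to `target_count`.
    """
    updated = []
    found = False
    for group in described:
        spec = {key: group[key] for key in _CARRY_OVER if key in group}
        if group["InstanceGroupName"] == group_name:
            spec["InstanceCount"] = target_count
            found = True
        else:
            spec["InstanceCount"] = group["TargetCount"]
        updated.append(spec)
    if not found:
        raise ValueError(f"instance group {group_name!r} not found in cluster")
    return updated


def lambda_handler(event, context, client=None):
    """Entry point invoked by the scale-up or scale-down EventBridge rule.

    `client` is injectable for testing; Lambda itself passes only
    `event` and `context`.
    """
    if client is None:
        import boto3  # available in the Lambda Python runtime

        client = boto3.client("sagemaker")

    cluster_name = event["cluster_name"]            # hypothetical event field
    group_name = event["instance_group_name"]       # hypothetical event field
    target_count = int(event["target_count"])       # hypothetical event field

    cluster = client.describe_cluster(ClusterName=cluster_name)
    groups = build_instance_groups(cluster["InstanceGroups"], group_name, target_count)
    client.update_cluster(ClusterName=cluster_name, InstanceGroups=groups)
    return {"cluster": cluster_name, "instance_group": group_name, "target_count": target_count}
```

With this shape, the scale-up and scale-down rules can share one function and differ only in the `target_count` they send.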
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.