added new tool to scale up-down nodes on an instance group #708

Open: paragao wants to merge 2 commits into main

@paragao commented Jun 5, 2025

No previous issue, new feature.

Added a new AWS CloudFormation template to the Architecture/Common folder that deploys a solution to scale the compute nodes of an instance group up or down.

The template deploys Amazon EventBridge rules that trigger an AWS Lambda function to update the node count of an instance group. Each rule is driven by a cron expression: one rule scales up and another scales down.
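For reviewers who want a picture of the flow described above, here is a minimal sketch of what such a Lambda handler could look like. It is not the code in this PR. It assumes the instance group belongs to a SageMaker HyperPod cluster and that each EventBridge rule passes a constant JSON input; the event keys `cluster_name`, `instance_group_name`, and `target_count` are hypothetical names chosen for illustration, and the list of carried-over fields is an assumption that may need extending for a given cluster.

```python
import copy

# Fields returned by describe_cluster for each instance group that are
# sent back unchanged in update_cluster. Assumption: this subset is
# sufficient for the cluster at hand; extend it if your groups use more.
_CARRIED_FIELDS = (
    "InstanceGroupName",
    "InstanceType",
    "LifeCycleConfig",
    "ExecutionRole",
    "ThreadsPerCore",
)


def build_instance_group_specs(described_groups, group_name, target_count):
    """Build the InstanceGroups list for update_cluster.

    Every group keeps its current target count except `group_name`,
    which is set to `target_count`.
    """
    if target_count < 0:
        raise ValueError("target_count must be non-negative")

    specs = []
    found = False
    for group in described_groups:
        # Copy only the fields we carry over, dropping read-only ones
        # such as CurrentCount.
        spec = {
            key: copy.deepcopy(group[key])
            for key in _CARRIED_FIELDS
            if key in group
        }
        if group["InstanceGroupName"] == group_name:
            spec["InstanceCount"] = target_count
            found = True
        else:
            spec["InstanceCount"] = group["TargetCount"]
        specs.append(spec)

    if not found:
        raise KeyError(f"instance group {group_name!r} not found")
    return specs


def lambda_handler(event, context):
    # Imported lazily so the helper above can be exercised without AWS.
    import boto3

    # Hypothetical event shape, supplied as constant input by each rule.
    cluster_name = event["cluster_name"]
    group_name = event["instance_group_name"]
    target_count = int(event["target_count"])

    sagemaker = boto3.client("sagemaker")
    cluster = sagemaker.describe_cluster(ClusterName=cluster_name)
    specs = build_instance_group_specs(
        cluster["InstanceGroups"], group_name, target_count
    )
    sagemaker.update_cluster(ClusterName=cluster_name, InstanceGroups=specs)
    return {"instance_group": group_name, "target_count": target_count}
```

Under these assumptions, the scale-up rule and the scale-down rule would differ only in their schedule and in the `target_count` they pass. For example, a rule with the schedule `cron(0 8 ? * MON-FRI *)` could pass a larger count and a later rule a smaller one. EventBridge cron expressions have six fields (minutes, hours, day of month, month, day of week, year).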

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@KeitaW (Contributor) previously requested changes Jun 5, 2025

@paragao (Author) left a comment

moved file and updated README.md as requested.

@KeitaW (Contributor) left a comment

maybe the previous file still remains in the original directory

@@ -12,3 +12,4 @@ This template creates a S3 Bucket with all public access disabled. To deploy it,
This template deploys a stack to receive human-readable email notifications for HyperPod cluster status changes and node health events. See the [workshop page](https://catalog.workshops.aws/sagemaker-hyperpod/en-US/07-tips-and-tricks/26-event-bridge) for more details.
Contributor:

Kindly remove this file + directory.

Author:

@KeitaW these files were already there and are not part of this PR.

If you want, I can open a separate PR to move the files that were originally there too. I left them untouched here because other assets, such as workshops, might link to those files, and moving them would break those links.

Contributor:

ah okay, my bad.

@KeitaW self-requested a review June 9, 2025 22:39
@KeitaW dismissed their stale review June 9, 2025 22:40

The file location LGTM.

@KeitaW requested review from amanshanbhag and nghtm June 9, 2025 22:41