This repository contains the code and resources for the "Scaling LLM deployments with Arm and Google Kubernetes Engine" workshop, available on Qwiklabs.
Feel free to open issues or pull requests if you have questions or suggestions.
In this workshop, you will learn how to deploy and scale large language models (LLMs) on Arm-based nodes in Google Kubernetes Engine (GKE). The hands-on labs guide you through setting up your environment, deploying models, and tuning the deployment for performance and cost.
Access the workshop on Qwiklabs: Workshop Link