|
| 1 | +--- |
| 2 | +title: "Deploy on AWS Elastic Kubernetes Service" |
| 3 | +--- |
| 4 | + |
| 5 | +LLMstudio is available as a container service on [AWS Marketplace](https://aws.amazon.com/marketplace). |
| 6 | +This tutorial shows how to deploy the containerized application on EKS. |
| 7 | + |
| 8 | + |
| 9 | + |
| 10 | +## Prerequisites |
| 11 | + |
| 12 | +To follow this tutorial, you need to: |
| 13 | + |
| 14 | +- Have the `AWSMarketplaceFullAccess` policy permission in your AWS account |
| 15 | +- Have a valid subscription to LLMstudio on AWS Marketplace |
| 16 | + |
| 17 | + |
| 18 | +## Set Up Prerequisite Command-Line Tools |
| 19 | + |
| 20 | +1. Open AWS CloudShell |
| 21 | +2. Run the following commands to install the prerequisite command-line tools: |
| 22 | + |
| 23 | +```bash |
| 24 | +mkdir software \ |
| 25 | +&& cd software \ |
| 26 | +&& ARCH=amd64 PLATFORM=$(uname -s)_$ARCH && \ |
| 27 | +curl -sLO "https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_$PLATFORM.tar.gz" && \ |
| 28 | +tar -xzf eksctl_$PLATFORM.tar.gz -C /tmp && \ |
| 29 | +sudo mv /tmp/eksctl /usr/local/bin && \ |
| 30 | +rm eksctl_$PLATFORM.tar.gz && \ |
| 31 | +curl -o kubectl https://s3.us-west-2.amazonaws.com/amazon-eks/1.27.4/2023-08-16/bin/linux/amd64/kubectl && \ |
| 32 | +chmod +x kubectl && \ |
| 33 | +sudo mv kubectl /usr/local/bin && \ |
| 34 | +wget -q https://get.helm.sh/helm-v3.12.3-linux-amd64.tar.gz && \ |
| 35 | +tar -zxvf helm-v3.12.3-linux-amd64.tar.gz && \ |
| 36 | +sudo mv linux-amd64/helm /usr/local/bin/helm && \ |
| 37 | +rm -rf helm-v3.12.3-linux-amd64.tar.gz linux-amd64 && \ |
| 38 | +curl -L https://github.com/a8m/envsubst/releases/download/v1.2.0/envsubst-$(uname -s)-$(uname -m) -o envsubst && \ |
| 39 | +chmod +x envsubst && \ |
| 40 | +sudo mv envsubst /usr/local/bin && \ |
| 41 | +sudo yum install -y yum-utils && \ |
| 42 | +sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/AmazonLinux/hashicorp.repo && \ |
| 43 | +sudo yum -y install terraform && \ |
| 44 | +sudo yum update -y && \ |
| 45 | +sudo amazon-linux-extras install -y docker \ |
| 46 | +&& echo $(nohup sudo dockerd > /tmp/dockerd.log 2>&1 &); |
| 47 | + |
| 48 | +``` |
| 49 | + |
| 50 | +## Pull Images from ECR |
| 51 | + |
| 52 | +When you subscribe to the product on AWS Marketplace, you'll be given a command to pull the images from ECR. |
| 53 | +After running the given command, you'll get the ECR repository URL and the image version, which you need to set as environment variables. |
| 54 | + |
| 55 | +```bash |
| 56 | +export ECR_REPOSITORY=<ecrrepourl> |
| 57 | +export PRODUCT_VERSION=<version> |
| 58 | +``` |
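
Later steps substitute these variables into YAML files, so it can help to confirm they are set before continuing. The following is a minimal sketch of such a check. The helper name `require_vars` is hypothetical and is not part of LLMstudio or any AWS tool.

```bash
# Hypothetical helper: fail early if any named environment variable is unset or empty.
# The body runs in a subshell ( ... ) so its working variables do not leak into your session.
require_vars() (
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  # Return 0 when every variable is set, 1 otherwise.
  exit "$missing"
)
```

For example, `require_vars ECR_REPOSITORY PRODUCT_VERSION` returns a non-zero status and names the missing variable if either export was skipped.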
| 59 | + |
| 60 | + |
| 61 | +## Create cluster |
| 62 | + |
| 63 | +1. Create a file called `cluster.yaml` with the following content: |
| 64 | + |
| 65 | +<Accordion title="cluster.yaml"> |
| 66 | + ```yaml cluster.yaml |
| 67 | +--- |
| 68 | +apiVersion: eksctl.io/v1alpha5 |
| 69 | +kind: ClusterConfig |
| 70 | + |
| 71 | +metadata: |
| 72 | + name: ${PRODUCT_NAME}-cluster |
| 73 | + region: ${AWS_REGION} |
| 74 | + version: "1.30" |
| 75 | + |
| 76 | +availabilityZones: ["${AWS_REGION}a", "${AWS_REGION}b"] |
| 77 | + |
| 78 | +fargateProfiles: |
| 79 | + - name: fp-default |
| 80 | + selectors: |
| 81 | + # All workloads in the "default" Kubernetes namespace will be |
| 82 | + # scheduled onto Fargate: |
| 83 | + - namespace: default |
| 84 | + # All workloads in the "kube-system" Kubernetes namespace will be |
| 85 | + # scheduled onto Fargate: |
| 86 | + - namespace: kube-system |
| 87 | + - namespace: mcp |
| 88 | +iam: |
| 89 | + withOIDC: true |
| 90 | + serviceAccounts: |
| 91 | + - metadata: |
| 92 | + name: ${PRODUCT_NAME}-sa |
| 93 | + namespace: mcp |
| 94 | + attachPolicy: |
| 95 | + Version: '2012-10-17' |
| 96 | + Statement: |
| 97 | + - Action: |
| 98 | + - "aws-marketplace:MeterUsage" |
| 99 | + - "aws-marketplace:RegisterUsage" |
| 100 | + - "license-manager:CheckoutLicense" |
| 101 | + - "license-manager:CheckInLicense" |
| 102 | + - "license-manager:ExtendLicenseConsumption" |
| 103 | + - "license-manager:GetLicense" |
| 104 | + Effect: Allow |
| 105 | + Resource: "*" |
| 106 | + - metadata: |
| 107 | + name: ${PRODUCT_NAME}-icsa |
| 108 | + namespace: kube-system |
| 109 | + attachPolicy: |
| 110 | + Version: '2012-10-17' |
| 111 | + Statement: |
| 112 | + - Effect: Allow |
| 113 | + Action: |
| 114 | + - acm:DescribeCertificate |
| 115 | + - acm:ListCertificates |
| 116 | + - acm:GetCertificate |
| 117 | + Resource: "*" |
| 118 | + - Effect: Allow |
| 119 | + Action: |
| 120 | + - ec2:AuthorizeSecurityGroupIngress |
| 121 | + - ec2:CreateSecurityGroup |
| 122 | + - ec2:CreateTags |
| 123 | + - ec2:DeleteTags |
| 124 | + - ec2:DeleteSecurityGroup |
| 125 | + - ec2:DescribeAccountAttributes |
| 126 | + - ec2:DescribeAddresses |
| 127 | + - ec2:DescribeInstances |
| 128 | + - ec2:DescribeInstanceStatus |
| 129 | + - ec2:DescribeInternetGateways |
| 130 | + - ec2:DescribeNetworkInterfaces |
| 131 | + - ec2:DescribeSecurityGroups |
| 132 | + - ec2:DescribeSubnets |
| 133 | + - ec2:DescribeTags |
| 134 | + - ec2:DescribeVpcs |
| 135 | + - ec2:ModifyInstanceAttribute |
| 136 | + - ec2:ModifyNetworkInterfaceAttribute |
| 137 | + - ec2:RevokeSecurityGroupIngress |
| 138 | + Resource: "*" |
| 139 | + - Effect: Allow |
| 140 | + Action: |
| 141 | + - elasticloadbalancing:AddListenerCertificates |
| 142 | + - elasticloadbalancing:AddTags |
| 143 | + - elasticloadbalancing:CreateListener |
| 144 | + - elasticloadbalancing:CreateLoadBalancer |
| 145 | + - elasticloadbalancing:CreateRule |
| 146 | + - elasticloadbalancing:CreateTargetGroup |
| 147 | + - elasticloadbalancing:DeleteListener |
| 148 | + - elasticloadbalancing:DeleteLoadBalancer |
| 149 | + - elasticloadbalancing:DeleteRule |
| 150 | + - elasticloadbalancing:DeleteTargetGroup |
| 151 | + - elasticloadbalancing:DeregisterTargets |
| 152 | + - elasticloadbalancing:DescribeListenerCertificates |
| 153 | + - elasticloadbalancing:DescribeListeners |
| 154 | + - elasticloadbalancing:DescribeLoadBalancers |
| 155 | + - elasticloadbalancing:DescribeLoadBalancerAttributes |
| 156 | + - elasticloadbalancing:DescribeRules |
| 157 | + - elasticloadbalancing:DescribeSSLPolicies |
| 158 | + - elasticloadbalancing:DescribeTags |
| 159 | + - elasticloadbalancing:DescribeTargetGroups |
| 160 | + - elasticloadbalancing:DescribeTargetGroupAttributes |
| 161 | + - elasticloadbalancing:DescribeTargetHealth |
| 162 | + - elasticloadbalancing:ModifyListener |
| 163 | + - elasticloadbalancing:ModifyLoadBalancerAttributes |
| 164 | + - elasticloadbalancing:ModifyRule |
| 165 | + - elasticloadbalancing:ModifyTargetGroup |
| 166 | + - elasticloadbalancing:ModifyTargetGroupAttributes |
| 167 | + - elasticloadbalancing:RegisterTargets |
| 168 | + - elasticloadbalancing:RemoveListenerCertificates |
| 169 | + - elasticloadbalancing:RemoveTags |
| 170 | + - elasticloadbalancing:SetIpAddressType |
| 171 | + - elasticloadbalancing:SetSecurityGroups |
| 172 | + - elasticloadbalancing:SetSubnets |
| 173 | + - elasticloadbalancing:SetWebACL |
| 174 | + Resource: "*" |
| 175 | + - Effect: Allow |
| 176 | + Action: |
| 177 | + - iam:CreateServiceLinkedRole |
| 178 | + - iam:GetServerCertificate |
| 179 | + - iam:ListServerCertificates |
| 180 | + Resource: "*" |
| 181 | + - Effect: Allow |
| 182 | + Action: |
| 183 | + - cognito-idp:DescribeUserPoolClient |
| 184 | + Resource: "*" |
| 185 | + - Effect: Allow |
| 186 | + Action: |
| 187 | + - waf-regional:GetWebACLForResource |
| 188 | + - waf-regional:GetWebACL |
| 189 | + - waf-regional:AssociateWebACL |
| 190 | + - waf-regional:DisassociateWebACL |
| 191 | + Resource: "*" |
| 192 | + - Effect: Allow |
| 193 | + Action: |
| 194 | + - tag:GetResources |
| 195 | + - tag:TagResources |
| 196 | + Resource: "*" |
| 197 | + - Effect: Allow |
| 198 | + Action: |
| 199 | + - waf:GetWebACL |
| 200 | + Resource: "*" |
| 201 | + - Effect: Allow |
| 202 | + Action: |
| 203 | + - shield:DescribeProtection |
| 204 | + - shield:GetSubscriptionState |
| 205 | + - shield:DeleteProtection |
| 206 | + - shield:CreateProtection |
| 207 | + - shield:DescribeSubscription |
| 208 | + - shield:ListProtections |
| 209 | + Resource: "*" |
| 210 | +cloudWatch: |
| 211 | + clusterLogging: |
| 212 | + enableTypes: ["*"] |
| 213 | + logRetentionInDays: 1 |
| 214 | + |
| 215 | +``` |
| 216 | +</Accordion> |
| 217 | + |
| 218 | + |
| 219 | + |
| 220 | + |
| 221 | +2. Run the following command to create an **EKS cluster**. |
| 222 | + |
| 223 | + |
| 224 | +```bash |
| 225 | +export AWS_REGION=<REGION> |
| 226 | +export PRODUCT_NAME=<PRODUCT_NAME> |
| 227 | +envsubst < cluster.yaml > $PRODUCT_NAME-cluster.yaml |
| 228 | +eksctl create cluster -f $PRODUCT_NAME-cluster.yaml |
| 229 | +``` |
| 230 | + |
| 231 | +## Wait for EKS Cluster |
| 232 | +- It takes **10-20 minutes** for the EKS cluster to be created. |
| 233 | +- You will see a prompt in the CLI when the cluster is ready. |
| 234 | + |
| 235 | +--- |
| 236 | + |
| 237 | +## Create Deployment File |
| 238 | + |
| 239 | +1. Create a file called `deployment.yaml` with the following content: |
| 240 | + |
| 241 | +<Accordion title="deployment.yaml"> |
| 242 | + |
| 243 | +```yaml |
| 244 | + |
| 245 | +--- |
| 246 | +apiVersion: apps/v1 |
| 247 | +kind: Deployment |
| 248 | +metadata: |
| 249 | + name: llmstudio-proxy |
| 250 | + namespace: mcp |
| 251 | + labels: |
| 252 | + app: llmstudio-proxy |
| 253 | +spec: |
| 254 | + replicas: 1 |
| 255 | + selector: |
| 256 | + matchLabels: |
| 257 | + app: llmstudio-proxy |
| 258 | + template: |
| 259 | + metadata: |
| 260 | + labels: |
| 261 | + app: llmstudio-proxy |
| 262 | + spec: |
| 263 | + serviceAccountName: ${PRODUCT_NAME}-sa |
| 264 | + volumes: |
| 265 | + - name: shared-data |
| 266 | + emptyDir: {} |
| 267 | + containers: |
| 268 | + - name: llmstudio-proxy |
| 269 | + image: $ECR_REPOSITORY:$PRODUCT_VERSION |
| 270 | + ports: |
| 271 | + - containerPort: 50001 |
| 272 | + env: |
| 273 | + - name: OPENAI_API_KEY |
| 274 | + value: $OPENAI_API_KEY |
| 275 | + # Add here the keys for other providers as needed |
| 276 | + |
| 277 | +--- |
| 278 | +apiVersion: v1 |
| 279 | +kind: Service |
| 280 | +metadata: |
| 281 | + name: llmstudio-proxy |
| 282 | + namespace: mcp |
| 283 | +spec: |
| 284 | + type: NodePort |
| 285 | + selector: |
| 286 | + app: llmstudio-proxy |
| 287 | + ports: |
| 288 | + - protocol: TCP |
| 289 | + port: 50001 |
| 290 | + targetPort: 50001 |
| 291 | + |
| 292 | + |
| 293 | +``` |
| 294 | + |
| 295 | + |
| 296 | +</Accordion> |
| 297 | + |
| 298 | + |
| 299 | + |
| 300 | +2. Set the environment variables you need, e.g. the API keys for the model providers you use: |
| 301 | +```bash |
| 302 | + |
| 303 | +export OPENAI_API_KEY="...." |
| 304 | + |
| 305 | +``` |
| 306 | + |
| 307 | +3. Run the following command to render the deployment file for your container image with the environment variables substituted: |
| 308 | + |
| 309 | +```bash |
| 310 | +envsubst < deployment.yaml > $PRODUCT_NAME-deployment.yaml |
| 311 | +``` |
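
A variable that is unset when `envsubst` runs is typically replaced with an empty string, which would leave the rendered file with, for example, an empty image reference. As a rough sketch, the hypothetical helper below (`unset_template_vars` is an illustrative name, not an existing tool) lists the variables a template references that are currently unset or empty, so you can run it on `deployment.yaml` before rendering.

```bash
# Hypothetical helper: print the names of variables referenced in a template
# (as $NAME or ${NAME}) that are unset or empty in the current shell.
unset_template_vars() (
  # grep -o prints each match on its own line; tr strips the $, { and } characters;
  # sort -u removes duplicates.
  for name in $(grep -oE '\$\{?[A-Za-z_][A-Za-z0-9_]*\}?' "$1" | tr -d '${}' | sort -u); do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
)
```

Running `unset_template_vars deployment.yaml` prints nothing when every referenced variable has a value.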
| 312 | + |
| 313 | +--- |
| 314 | + |
| 315 | +## Deploy Container Image |
| 316 | +Run the following command to deploy the container image into the EKS cluster: |
| 317 | + |
| 318 | +```bash |
| 319 | +kubectl apply -f $PRODUCT_NAME-deployment.yaml |
| 320 | +``` |
| 321 | + |
| 322 | +--- |
| 323 | + |
| 324 | +## Check Pod Status |
| 325 | +Once the container starts, check the pod status with the following command: |
| 326 | + |
| 327 | +```bash |
| 328 | +kubectl get pods -n mcp |
| 329 | +``` |
| 330 | + |
| 331 | +- Wait and re-run the previous command until the **STATUS** shows `Running`. |
| 332 | +- The output should look similar to this: |
| 333 | + |
| 334 | +```plaintext |
| 335 | +NAME READY STATUS RESTARTS AGE |
| 336 | +llmstudio-proxy 1/1 Running 0 79s |
| 337 | +``` |
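
Instead of re-running the command by hand, you can let `kubectl rollout status` block until the Deployment is ready. The wrapper below is a small sketch: the function name `wait_for_rollout` and the 300-second default timeout are assumptions chosen for illustration, while the deployment name and namespace come from the `deployment.yaml` above.

```bash
# Hypothetical wrapper: block until the Deployment's pods are ready, or time out.
# Optional arguments: deployment name, namespace, timeout.
wait_for_rollout() {
  kubectl rollout status "deployment/${1:-llmstudio-proxy}" -n "${2:-mcp}" --timeout="${3:-300s}"
}
```

Calling `wait_for_rollout` with no arguments waits on `llmstudio-proxy` in the `mcp` namespace and returns a non-zero status if the rollout does not finish within the timeout.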
| 338 | + |
| 339 | +--- |
| 340 | + |
| 341 | +## View Logs |
| 342 | +Run the following command to get the log output of the pod: |
| 343 | + |
| 344 | +```bash |
| 345 | +kubectl logs "$(kubectl get pods -n mcp --output name)" -c llmstudio-proxy -n mcp |
| 346 | +``` |
| 347 | + |
| 348 | + |
| 349 | +You'll see the application server start up: |
| 350 | + |
| 351 | +```plaintext |
| 352 | +Running LLMstudio Proxy on http://0.0.0.0:50001 |
| 353 | +``` |
| 354 | + |
| 355 | +And that's it! You've deployed LLMstudio 🎉 |