Fix sagemaker-entrypoint* & remove SageMaker and Vertex from Dockerfile* #699
What does this PR do?
This PR fixes the AWS SageMaker entrypoints in sagemaker-entrypoint.sh and sagemaker-entrypoint-cuda-all.sh, which were missing the exec when launching the text-embeddings-router process. Without exec, the router was assigned a PID other than 1, so signals such as the one used for graceful shutdown (Ctrl + C) were not propagated to the process, and it did not stop.

Besides that, this PR removes the AWS SageMaker stages from both Dockerfile (CPU) and Dockerfile-cuda-all (NVIDIA GPU), as they are not used: those images are ported as-is into https://github.com/awslabs/llm-hosting-container/tree/main/huggingface/pytorch/tei/docker and rebuilt from scratch, so there is no clear benefit to keeping the stage in the build here.

Finally, this PR also removes the BUILD_ARG for Google Cloud Vertex AI, as well as the conditional if-else in Dockerfile-cuda-all that built the text-embeddings-router binary with --features google. This is not required, as the Dockerfile is also ported as-is into https://github.com/huggingface/Google-Cloud-Containers/tree/main/containers/tei, meaning the BUILD_ARG is not being used here.

Before submitting
insta snapshots?

Who can review?
Dockerfile and Dockerfile-cuda-all here, as those are ported and rebuilt for AWS anyway.

Finally, regarding the Google Cloud changes, I can verify from a technical standpoint, based on the containers released on Google Cloud, that this is not an issue, since those are ported and rebuilt elsewhere.
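
For reference, the effect of the missing exec described in this PR can be reproduced outside a container with a small sketch. It assumes a POSIX sh and mktemp; the two throwaway wrapper scripts and the `sh -c` command standing in for text-embeddings-router are hypothetical and are not the actual entrypoints touched here. The idea is only to show that a command launched with exec keeps the wrapper's PID (which is PID 1 when the wrapper is a container entrypoint), while a command launched without exec runs as a child with a different PID.

```shell
#!/bin/sh
# Sketch: does the launched command keep the PID of the wrapper script?
# `sh -c 'echo ...'` is a hypothetical stand-in for text-embeddings-router.

tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT   # clean up the temporary scripts on exit

# Wrapper WITHOUT exec: the command is forked as a child process.
cat > "$tmpdir/no-exec.sh" <<'EOF'
#!/bin/sh
echo "wrapper_pid=$$"
sh -c 'echo "router_pid=$$"'
exit 0
EOF

# Wrapper WITH exec: the command replaces the wrapper process.
cat > "$tmpdir/with-exec.sh" <<'EOF'
#!/bin/sh
echo "wrapper_pid=$$"
exec sh -c 'echo "router_pid=$$"'
EOF

# Run a wrapper and succeed only if both printed PIDs are identical.
pids_match() {
    out=$(sh "$1")
    wrapper=$(printf '%s\n' "$out" | sed -n 's/^wrapper_pid=//p')
    router=$(printf '%s\n' "$out" | sed -n 's/^router_pid=//p')
    [ -n "$wrapper" ] && [ "$wrapper" = "$router" ]
}

if pids_match "$tmpdir/with-exec.sh"; then
    echo "with exec: same PID"
else
    echo "with exec: different PID"
fi

if pids_match "$tmpdir/no-exec.sh"; then
    echo "without exec: same PID"
else
    echo "without exec: different PID"
fi
```

In a container, signals from the runtime are delivered to PID 1, so the "different PID" case corresponds to the situation this PR fixes: the shell receives the signal and the router, running as its child, never sees it.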