Description
Checklist
- I've prepended issue tag with type of change: [bug]
- (If applicable) I've attached the script to reproduce the bug
- (If applicable) I've documented below the DLC image/dockerfile this relates to
- (If applicable) I've documented below the tests I've run on the DLC image
- I'm using an existing DLC image listed here: https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/deep-learning-containers-images.html
- I've built my own container based off DLC (and I've attached the code used to build my own image)
Concise Description:
I want to be able to build PyTorch containers locally so that I can modify them and verify that everything (patching, etc.) runs correctly.
Steps to reproduce, straight from the README:
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin $ACCOUNT_ID.dkr.ecr.us-west-2.amazonaws.com
python3 -m venv dlc
source dlc/bin/activate
pip install -r src/requirements.txt
pip install -e .
bash src/setup.sh pytorch
python src/main.py --buildspec pytorch/training/buildspec-2-5-sm.yml \
    --framework pytorch \
    --image_types training \
    --device_types gpu \
    --py_versions py3
Error message:
Traceback (most recent call last):
  File "/path/to/deep-learning-containers/src/main.py", line 140, in <module>
    main()
  File "/path/to/deep-learning-containers/src/main.py", line 136, in main
    image_builder(buildspec_file, image_types, device_types)
  File "/path/to/deep-learning-containers/src/image_builder.py", line 378, in image_builder
    patch_helper.initiate_multithreaded_autopatch_prep(
  File "/path/to/deep-learning-containers/src/patch_helper.py", line 352, in initiate_multithreaded_autopatch_prep
    run(f"aws s3 cp s3://patch-dlc {download_path} --recursive", hide=True)
  File "/path/to/deep-learning-containers/dlc/lib/python3.11/site-packages/invoke/__init__.py", line 50, in run
    return Context().run(command, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/deep-learning-containers/dlc/lib/python3.11/site-packages/invoke/context.py", line 104, in run
    return self._run(runner, command, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/deep-learning-containers/dlc/lib/python3.11/site-packages/invoke/context.py", line 113, in _run
    return runner.run(command, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/deep-learning-containers/dlc/lib/python3.11/site-packages/invoke/runners.py", line 395, in run
    return self._run_body(command, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/deep-learning-containers/dlc/lib/python3.11/site-packages/invoke/runners.py", line 451, in _run_body
    return self.make_promise() if self._asynchronous else self._finish()
                                                          ^^^^^^^^^^^^^^
  File "/path/to/deep-learning-containers/dlc/lib/python3.11/site-packages/invoke/runners.py", line 518, in _finish
    raise UnexpectedExit(result)
invoke.exceptions.UnexpectedExit: Encountered a bad command exit code!
Command: 'aws s3 cp s3://patch-dlc /path/to/patch-dlc --recursive'
Exit code: 1
Do I need to be on EC2 to run this? Or is the s3://patch-dlc bucket accessible only to AWS employees?
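For anyone trying to narrow this down: the traceback only reports exit code 1, so it does not show whether the failure is an access-denied response, missing credentials, or something else. Below is a minimal sketch of a standalone check that runs `aws s3 ls` against the bucket and returns the CLI's stderr. The function name `check_bucket_access` and its `runner` parameter are hypothetical and not part of the DLC repo; the sketch only assumes the AWS CLI is installed and configured.

```python
import subprocess


def check_bucket_access(bucket="patch-dlc", runner=subprocess.run):
    """Return (ok, stderr) for `aws s3 ls s3://<bucket>` under the current credentials.

    `runner` is a hypothetical hook (defaults to subprocess.run) so the
    check can be exercised without touching AWS.
    """
    cmd = ["aws", "s3", "ls", f"s3://{bucket}"]
    try:
        # Capture stderr so the actual AWS CLI error text is visible,
        # unlike the exit-code-only message in the traceback above.
        proc = runner(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        # Raised by subprocess.run when the `aws` executable is missing.
        return False, "aws CLI not found on PATH"
    return proc.returncode == 0, proc.stderr.strip()
```

Calling `check_bucket_access()` from the activated `dlc` virtualenv and printing the returned stderr should make it clearer whether this is a permissions issue on the bucket or a problem with the local credentials.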