This file documents environment configurations used to run experiments on the cohort hardware.
All experiment files are stored in /disk/ocean/mfonseca, which we refer to as EXPERIMENT_HOME.
From EXPERIMENT_HOME, download the Miniconda installer for Linux:
wget https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh
The installer will ask you to confirm the default install location or to specify an alternate directory. Make sure to specify EXPERIMENT_HOME.
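If you would rather skip the interactive prompts, the installer can also be run non-interactively. The sketch below assumes the installer's -b (batch) and -p (prefix) options; the function name install_miniconda and the miniconda2 subdirectory in the example are illustrative choices, not part of the original setup. In batch mode the installer typically does not edit .bashrc, so PATH would have to be updated by hand.

```shell
#!/bin/sh
# Hypothetical helper: run a Miniconda installer without prompts.
#   $1 = path to the downloaded installer script
#   $2 = install prefix (should not exist yet)
install_miniconda() {
    installer=$1
    prefix=$2
    # -b: batch mode (no prompts, license accepted); -p: install prefix
    bash "$installer" -b -p "$prefix"
}

# Example (not executed here):
#   EXPERIMENT_HOME=/disk/ocean/mfonseca
#   install_miniconda Miniconda2-latest-Linux-x86_64.sh "$EXPERIMENT_HOME/miniconda2"
```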
The Anaconda installer will prepend Anaconda's location to PATH in .bashrc. However, when you log in via ssh, bash starts as a login shell, which reads ~/.bash_profile rather than ~/.bashrc, so this script is not run. To avoid having to source ~/.bashrc every time, create a ~/.bash_profile file with the following contents:
if [ -f ~/.bashrc ]; then
source ~/.bashrc
fi
More details here.
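The same file can be created from the command line. This is a minimal sketch that assumes you want to leave an existing profile untouched; the function name ensure_bash_profile and its optional path argument are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: create a bash profile that sources ~/.bashrc,
# unless the profile already exists.
#   $1 = profile path (defaults to ~/.bash_profile)
ensure_bash_profile() {
    profile=${1:-"$HOME/.bash_profile"}
    if [ -e "$profile" ]; then
        echo "$profile already exists, leaving it unchanged" >&2
        return 0
    fi
    # Quoted heredoc: the contents are written verbatim, nothing is expanded.
    cat > "$profile" <<'EOF'
if [ -f ~/.bashrc ]; then
    source ~/.bashrc
fi
EOF
}
```

Calling ensure_bash_profile with no argument targets ~/.bash_profile; open a new ssh session afterwards to check that conda is on PATH.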
Create a new conda environment named tensorflow:
conda create --name tensorflow
Follow the instructions in the Installing with Anaconda section of the TensorFlow docs.
Make sure you have the environment activated:
source activate tensorflow
Install TensorFlow via pip:
(tensorflow)$ pip install --ignore-installed --upgrade tfBinaryURL
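You have to choose the tfBinaryURL that matches both the TensorFlow version required for your experiments and the Python version of your environment (the cp34 tag in the URL below stands for CPython 3.4). For example, to install version 1.8.0 with GPU support, use this command: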
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.8.0-cp34-cp34m-linux_x86_64.whl
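To make the parts of the URL explicit, here is a small sketch that assembles it from the TensorFlow version, the Python tag and the ABI tag. The pattern is inferred from the single URL above, so a wheel may not exist for every combination; the function name tf_gpu_wheel_url is illustrative, and the TensorFlow docs remain the place to look up valid URLs.

```shell
#!/bin/sh
# Hypothetical helper: assemble a GPU wheel URL from its parts.
# Assumption: other wheels follow the same naming pattern as the 1.8.0 example.
#   $1 = TensorFlow version, e.g. 1.8.0
#   $2 = Python tag,         e.g. cp34
#   $3 = ABI tag,            e.g. cp34m
tf_gpu_wheel_url() {
    printf 'https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-%s-%s-%s-linux_x86_64.whl\n' "$1" "$2" "$3"
}

# Reproduces the URL used in the command above.
tf_gpu_wheel_url 1.8.0 cp34 cp34m
```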
Each TensorFlow version requires specific versions of CUDA and CuDNN (see a complete list here).
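To choose specific versions of CUDA and CuDNN you need to configure the environment variables LD_LIBRARY_PATH (CUDA) and DYLD_LIBRARY_PATH (CuDNN). Note that DYLD_LIBRARY_PATH is the macOS counterpart of LD_LIBRARY_PATH; the Linux dynamic loader only reads LD_LIBRARY_PATH, so on Linux the CuDNN library directory may also need to be added there. An easy way to set these variables is using Anaconda's activation/deactivation scripts (more details here). To create an activation script, execute the following commands: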
mkdir -p ~/<ANACONDA_HOME>/envs/<tensorflow_env>/etc/conda/activate.d
touch ~/<ANACONDA_HOME>/envs/<tensorflow_env>/etc/conda/activate.d/activate.sh
chmod +x ~/<ANACONDA_HOME>/envs/<tensorflow_env>/etc/conda/activate.d/activate.sh
vim ~/<ANACONDA_HOME>/envs/<tensorflow_env>/etc/conda/activate.d/activate.sh
Add the following contents (on the Cohort machines, CUDA_HOME and CUDNN_HOME will probably be in the /opt folder):
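#!/bin/sh
# Save the current values so the deactivation script can restore them.
ORIGINAL_LD_LIBRARY_PATH=$LD_LIBRARY_PATH
ORIGINAL_DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
# Prepend the chosen CUDA and CuDNN library directories.
export LD_LIBRARY_PATH=/<CUDA_HOME>/lib64:/<CUDA_HOME>/extras/CUPTI/lib64:$LD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=/<CUDNN_HOME>/lib:$DYLD_LIBRARY_PATH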
And for the deactivation script:
mkdir -p ~/<ANACONDA_HOME>/envs/<tensorflow_env>/etc/conda/deactivate.d
touch ~/<ANACONDA_HOME>/envs/<tensorflow_env>/etc/conda/deactivate.d/deactivate.sh
chmod +x ~/<ANACONDA_HOME>/envs/<tensorflow_env>/etc/conda/deactivate.d/deactivate.sh
vim ~/<ANACONDA_HOME>/envs/<tensorflow_env>/etc/conda/deactivate.d/deactivate.sh
Add the following contents:
#!/bin/sh
export LD_LIBRARY_PATH=$ORIGINAL_LD_LIBRARY_PATH
unset ORIGINAL_LD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$ORIGINAL_DYLD_LIBRARY_PATH
unset ORIGINAL_DYLD_LIBRARY_PATH
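Both hook scripts can also be written in one step instead of editing them in vim. The sketch below is one possible way to do that, not part of the original setup: the function name write_cuda_hooks and its three arguments are illustrative, and the generated scripts mirror the contents shown above.

```shell
#!/bin/sh
# Hypothetical helper: write the activation and deactivation hooks.
#   $1 = environment directory, $2 = CUDA directory, $3 = CuDNN directory
write_cuda_hooks() {
    env_dir=$1
    cuda_home=$2
    cudnn_home=$3
    mkdir -p "$env_dir/etc/conda/activate.d" "$env_dir/etc/conda/deactivate.d"

    # Unquoted heredoc: $cuda_home and $cudnn_home expand now,
    # while \$ keeps the variable reference for activation time.
    cat > "$env_dir/etc/conda/activate.d/activate.sh" <<EOF
#!/bin/sh
ORIGINAL_LD_LIBRARY_PATH=\$LD_LIBRARY_PATH
ORIGINAL_DYLD_LIBRARY_PATH=\$DYLD_LIBRARY_PATH
export LD_LIBRARY_PATH=$cuda_home/lib64:$cuda_home/extras/CUPTI/lib64:\$LD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$cudnn_home/lib:\$DYLD_LIBRARY_PATH
EOF

    # Quoted heredoc: written verbatim, nothing expands here.
    cat > "$env_dir/etc/conda/deactivate.d/deactivate.sh" <<'EOF'
#!/bin/sh
export LD_LIBRARY_PATH=$ORIGINAL_LD_LIBRARY_PATH
unset ORIGINAL_LD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$ORIGINAL_DYLD_LIBRARY_PATH
unset ORIGINAL_DYLD_LIBRARY_PATH
EOF

    chmod +x "$env_dir/etc/conda/activate.d/activate.sh" \
             "$env_dir/etc/conda/deactivate.d/deactivate.sh"
}
```

Pass the environment directory (the envs/<tensorflow_env> folder under your Anaconda install) followed by the CUDA and CuDNN directories. To check the result without conda, source activate.sh and then deactivate.sh by hand and confirm that LD_LIBRARY_PATH returns to its previous value.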
From this stackoverflow question.
To log the output of a screen window to a file, press Ctrl+a then Shift+h. You can view the file screenlog.0 while the program is still running.
If the log can’t be created, then try changing the screen window’s working directory:
Ctrl+a + :
and type, for example, chdir /home/foobar/baz
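Logging can also be switched on when the session is started, which avoids the key binding altogether. This is a sketch assuming screen's -L (log) and -dmS (start detached with a session name) options; the function name start_logged_session and the train.py script in the example are hypothetical. The handling of -L has differed between some screen releases, so check man screen on the machine you use.

```shell
#!/bin/sh
# Hypothetical helper: start a command in a detached screen session
# with output logging enabled from the beginning.
#   $1 = session name, remaining arguments = command to run
start_logged_session() {
    name=$1
    shift
    # -L: turn on output logging; -dmS NAME: start detached under NAME
    screen -L -dmS "$name" "$@"
}

# Example (not executed here):
#   cd "$EXPERIMENT_HOME" && start_logged_session train python train.py
#   tail -f screenlog.0
```

The log file is written to the directory screen was started from, so change into a writable directory first.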