- Train
python3 -m torchproject.train \
--train-data data/train_manifest.csv \
--test-data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--epochs 15 \
--log-step 5 \
--comment "my-train"
- Train with no saving
python3 -m torchproject.train \
--train-data data/train_manifest.csv \
--test-data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--epochs 15 \
--no-save
- Train specific experiment with comment
python3 -m torchproject.train \
--train-data data/train_manifest.csv \
--test-data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--epochs 15 \
--experiment 'my_experiment' \
--comment 'example-run'
- Train experiment with manager
4.1. Train experiment with MLFlow
python3 -m torchproject.train \
--train-data data/train_manifest.csv \
--test-data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--epochs 15 \
--experiment 'my_experiment' \
--mlflow
NOTE: this experiment must already exist on the MLFlow server
NOTE: when starting the container that runs train.py and test.py, mount the MLFlow data directory by adding
-v $PWD/dev/mlflow/data:/mlflow/mlruns
4.2. Train experiment with ClearML
python3 -m torchproject.train \
--train-data data/train_manifest.csv \
--test-data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--epochs 15 \
--experiment 'my_experiment' \
--clearml
NOTE: create credentials (access and secret keys) in the ClearML UI, then define them in /home/<user>/clearml.conf:
api {
    web_server: http://localhost:8080
    api_server: http://localhost:8008
    files_server: http://localhost:8081
    credentials {
        "access_key" = <ACCESS_KEY>
        "secret_key" = <SECRET_KEY>
    }
}
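Before starting a ClearML run it can help to confirm that the config file exists and names every setting listed above. The following is a minimal sketch, not part of torchproject: the helper names `missing_clearml_keys` and `check_clearml_conf` are hypothetical, and the check only looks for the presence of each key, not for valid values.

```python
import re
from pathlib import Path
from typing import List

# Settings shown in the clearml.conf example above.
REQUIRED_KEYS = ("web_server", "api_server", "files_server", "access_key", "secret_key")


def missing_clearml_keys(conf_text: str) -> List[str]:
    """Return the required keys that are not assigned anywhere in conf_text."""
    missing = []
    for key in REQUIRED_KEYS:
        # Accept `key: value`, `key = value`, and the quoted form `"key" = value`.
        pattern = rf'(?m)^\s*"?{re.escape(key)}"?\s*[:=]'
        if not re.search(pattern, conf_text):
            missing.append(key)
    return missing


def check_clearml_conf(path: Path = Path.home() / "clearml.conf") -> List[str]:
    """Read the config file and list missing keys; a missing file reports every key."""
    if not path.is_file():
        return list(REQUIRED_KEYS)
    return missing_clearml_keys(path.read_text(encoding="utf-8"))
```

Calling `check_clearml_conf()` returns an empty list when all five settings are present. Placeholder values such as `<ACCESS_KEY>` are not detected, so the keys still need to be filled in by hand.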
4.3. Train experiment with MLFlow and Tensorboard
python3 -m torchproject.train \
--train-data data/train_manifest.csv \
--test-data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--epochs 15 \
--experiment 'my_experiment' \
--mlflow \
--tensorboard
4.4. Train experiment with ClearML and Tensorboard
python3 -m torchproject.train \
--train-data data/train_manifest.csv \
--test-data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--epochs 15 \
--experiment 'my_experiment' \
--clearml \
--tensorboard
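The training variants above differ only in a few optional flags, so they can be assembled programmatically when scripting several runs. This is a sketch under stated assumptions: `build_train_command` is a hypothetical helper, it uses only flags that appear in the commands above, and the data and config paths are the example paths from this list.

```python
import shlex
import subprocess
from typing import List, Optional


def build_train_command(
    experiment: Optional[str] = None,
    manager: Optional[str] = None,  # "mlflow", "clearml", or None
    tensorboard: bool = False,
    batch_size: int = 50,
    epochs: int = 15,
) -> List[str]:
    """Build the argument list for one of the training commands listed above."""
    if manager not in (None, "mlflow", "clearml"):
        raise ValueError(f"unknown manager: {manager!r}")
    cmd = [
        "python3", "-m", "torchproject.train",
        "--train-data", "data/train_manifest.csv",
        "--test-data", "data/test_manifest.csv",
        "--config", "config.yaml",
        "--batch-size", str(batch_size),
        "--epochs", str(epochs),
    ]
    if experiment:
        cmd += ["--experiment", experiment]
    if manager:
        cmd.append(f"--{manager}")  # becomes --mlflow or --clearml
    if tensorboard:
        cmd.append("--tensorboard")
    return cmd


def run_training(**kwargs) -> None:
    """Print the command, then run it and raise if it exits with an error."""
    cmd = build_train_command(**kwargs)
    print(shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

For example, `run_training(experiment="my_experiment", manager="mlflow", tensorboard=True)` corresponds to the command in 4.3.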
- Test
python3 -m torchproject.test \
--data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--experiment 'my_experiment' \
--log-step 1 \
--comment "my-test"
- Test with no saving
python3 -m torchproject.test \
--data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--epochs 15 \
--no-save
- Test with reference to train experiment run
python3 -m torchproject.test \
--data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--experiment 'my_experiment' \
--run 3
NOTE: the run directory 'my_experiment/train/003*' must already exist
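Whether a referenced run exists can be checked before launching the test. The sketch below assumes, based on the pattern '003*' above, that run numbers are zero-padded to three digits. The root directory that holds the experiments is not stated here, so `root` is a hypothetical parameter, as is the helper name `find_run_dirs`.

```python
from pathlib import Path
from typing import List


def find_run_dirs(root: Path, experiment: str, stage: str, run: int) -> List[Path]:
    """Return directories matching <root>/<experiment>/<stage>/<NNN>*."""
    prefix = f"{run:03d}"  # assumption: run 3 is stored as "003..."
    stage_dir = Path(root) / experiment / stage
    if not stage_dir.is_dir():
        return []
    return sorted(p for p in stage_dir.glob(f"{prefix}*") if p.is_dir())
```

An empty result from `find_run_dirs(root, "my_experiment", "train", 3)` means the run passed with `--run 3` has not been created yet.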
- Test with reference to train experiment run and specific weights
python3 -m torchproject.test \
--data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--experiment 'my_experiment' \
--run 3 \
--weights 5.pt
- Test with manager
11.1. Test with MLFlow
python3 -m torchproject.test \
--data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--experiment 'my_experiment' \
--run 3 \
--weights 5.pt \
--mlflow
11.2. Test with ClearML
python3 -m torchproject.test \
--data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--experiment 'my_experiment' \
--run 3 \
--weights 5.pt \
--clearml
NOTE: complete the ClearML setup described in 4.2
11.3. Test with MLFlow and Tensorboard
python3 -m torchproject.test \
--data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--experiment 'my_experiment' \
--run 3 \
--weights 5.pt \
--mlflow \
--tensorboard
11.4. Test with ClearML and Tensorboard
python3 -m torchproject.test \
--data data/test_manifest.csv \
--config config.yaml \
--batch-size 50 \
--experiment 'my_experiment' \
--run 3 \
--weights 5.pt \
--clearml \
--tensorboard
- Generate Tensorboard logs from metrics in CSV
python3 -m torchproject.utils -r 'my_experiment/test/001'
NOTE: the run directory 'my_experiment/test/001*' must already exist
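For orientation, converting CSV metrics to Tensorboard logs can be sketched as follows. This is not the implementation in torchproject.utils: the CSV layout (a `step` column plus one column per metric) and the names `iter_scalars` and `write_tensorboard` are assumptions made for illustration. Only `SummaryWriter` from `torch.utils.tensorboard` is an existing API.

```python
import csv
import io
from typing import Iterator, Tuple


def iter_scalars(csv_text: str) -> Iterator[Tuple[str, int, float]]:
    """Yield (tag, step, value) for every non-empty metric cell.

    Assumed layout: a header row with a `step` column and one column per metric.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        step = int(row["step"])
        for tag, raw in row.items():
            # Skip the step column, overflow cells, and empty or missing values.
            if tag is None or tag == "step" or raw in (None, ""):
                continue
            yield tag, step, float(raw)


def write_tensorboard(csv_text: str, log_dir: str) -> None:
    """Write each metric as a Tensorboard scalar series under log_dir."""
    # Imported lazily so that parsing works without PyTorch installed.
    from torch.utils.tensorboard import SummaryWriter

    writer = SummaryWriter(log_dir=log_dir)
    for tag, step, value in iter_scalars(csv_text):
        writer.add_scalar(tag, value, global_step=step)
    writer.close()
```

The resulting directory can then be opened with `tensorboard --logdir <log_dir>`.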