I'll put some notes here on the existing SSH test structure, and what would be nice to test next.
Existing:
There are two sets of file transfer tests. test_transfer_file.py is the original set of tests, which mostly make directories of different data_types and check that they are uploaded / downloaded properly. These do not test the newer keys for non-data_type files, e.g. "all_non_sub".
test_ssh_file_transfer.py has extended tests that cover files and folders across many permutations of sub / ses / datatype, including all the new keys (e.g. "all_non_sub"). These tests always run over the local filesystem, with an option to also run over SSH (and there are also tests in test_ssh_setup.py that test the SSH setup itself). The data transfer tests do not require a password, but the SSH setup tests do. Currently all SSH test settings are stored in a conftest.py, including a path to a txt file containing the password. However, this is a bad idea and is probably safest to remove, even if it means forgoing the couple of tests that check SSH key setup; a sketch of an alternative is below.
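If we do keep any password-requiring tests, one option (a sketch of a common pytest pattern, not what the suite currently does) is to read the password from an environment variable and skip the SSH setup tests when it is unset, so nothing sensitive is stored in the repo. The variable name here is made up:

```python
import os

import pytest

# Read the SSH test password from the environment rather than from a
# txt file referenced in conftest.py. The variable name is an assumption.
SSH_TEST_PASSWORD = os.environ.get("DATASHUTTLE_SSH_TEST_PASSWORD")

requires_ssh_password = pytest.mark.skipif(
    SSH_TEST_PASSWORD is None,
    reason="DATASHUTTLE_SSH_TEST_PASSWORD not set; skipping SSH setup tests",
)


@requires_ssh_password
def test_ssh_key_setup():
    ...  # would use SSH_TEST_PASSWORD to drive the key setup flow
```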
The idea is that SSH tests are always off on CI, and we run them only on the dedicated PC we will set up.
As for tests on real data, I had a look into the memory-blocking approach for Linux but I don't think there is a simple way to make it cross-platform. As such, I started to put together a fake experimental project folder with real data at (ceph) X:\neuroinformatics\scratch\datashuttle_tests\real_data. We can extend this, and then set up real-life tests around it that check long-latency transfers.
Things to test next:

Currently the default setting is that we never overwrite data (there is a config for this that can be changed). One problem is that if a transfer somehow fails halfway, a corrupt version of the file will exist on remote but will never be updated again, because it will not be overwritten. This should still warn the user, but we need to explicitly test what happens in this case, check that the warning is clear, and see if there are any nice solutions. rclone may handle interrupted transfers itself, but we need to test our end (notes: https://stackoverflow.com/questions/18601828/python-block-network-connections-for-testing-purposes, though this might only drop connections for the Python process, not the entire machine); a sketch of that approach follows below.
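Here is a minimal sketch of the approach from the Stack Overflow link: monkeypatching socket connections so Python-level network access fails mid-test. As noted above, this only affects the test process itself, not a subprocess like the rclone binary, so it may not be enough on its own:

```python
import socket

import pytest


@pytest.fixture
def no_network(monkeypatch):
    """Make all Python-level socket connections fail within a test.

    This does NOT affect subprocesses (e.g. the rclone binary),
    only code running in this Python process.
    """

    def guarded_connect(self, address):
        raise OSError(f"Network access blocked in tests: {address}")

    monkeypatch.setattr(socket.socket, "connect", guarded_connect)


def test_fails_without_network(no_network):
    with pytest.raises(OSError):
        socket.create_connection(("example.com", 22))
```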
Currently all tests are from one project (local) to another (remote). It would be nice to simulate 2-3 local projects and push them all to the remote at the same time; not 100% sure what rclone will do in this situation. A rough sketch of such a test is below.
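One possible shape for that test, assuming a hypothetical make_project() helper that configures each local project against the same remote (the helper and the upload_all call are placeholders, not the confirmed API):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical: three local projects, all configured to push to the
# same remote path. make_project() is a placeholder helper.
projects = [make_project(f"local_project_{i}") for i in range(3)]

# Push all projects at the same time and surface any rclone errors.
with ThreadPoolExecutor(max_workers=len(projects)) as executor:
    futures = [executor.submit(project.upload_all) for project in projects]
    for future in futures:
        future.result()  # re-raises exceptions from the worker threads
```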
Currently only rawdata is tested. When we extend to derivatives, we will need to extend the tests.
Switching between local and SSH transfer methods in a single repo (currently these are isolated tests).
SSH tests are very slow on Windows due to the way they are checked through the mounted filesystem (see pathtable_and_project() in test_ssh_file_transfer.py).
> The idea is that SSH tests are always off on CI, and we run them only on the dedicated PC we will set up.
If we can, it would be great to get these set up as local GH Actions runners and integrate with CI.
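A rough sketch of what that could look like, assuming a self-hosted runner registered on the dedicated PC and a pytest marker for the SSH tests (the runner label and marker name below are made up):

```yaml
# Hypothetical workflow: run only the SSH tests, and only on the
# self-hosted runner (the dedicated PC), keeping them off normal CI.
name: ssh-tests

on: workflow_dispatch

jobs:
  ssh-tests:
    runs-on: [self-hosted, ssh-test-pc]  # assumed runner label
    steps:
      - uses: actions/checkout@v3
      - run: pip install -e .[dev]
      - run: pytest -m ssh  # assumed pytest marker for the SSH tests
```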
> Currently the default setting is that we never overwrite data (there is a config for this that can be changed). One problem is that if a transfer somehow fails halfway, a corrupt version of the file will exist on remote but will never be updated again, because it will not be overwritten. This should still warn the user, but we need to explicitly test what happens in this case, check that the warning is clear, and see if there are any nice solutions.
One way to address this is to mark datasets as either OK (e.g. leave a .transfer.complete empty file after transfer) or not OK (write a .transfer.inprogress file at the start, and clean it up at the end). datashuttle can read these and ask the user what they'd like to do. Although, does rclone leave a corrupt version if a transfer is interrupted? I think e.g. rsync writes to a temp file until the transfer is complete. A sketch of the marker-file idea is below.
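A minimal sketch of that marker-file idea, assuming the remote dataset folder is reachable as a filesystem path (e.g. via a mount) and wrapping whatever function actually runs the rclone transfer:

```python
from pathlib import Path


def transfer_with_markers(transfer_func, remote_dataset: Path) -> None:
    """Run a transfer, leaving marker files that record its outcome."""
    in_progress = remote_dataset / ".transfer.inprogress"
    complete = remote_dataset / ".transfer.complete"

    # Mark the dataset as mid-transfer before starting.
    complete.unlink(missing_ok=True)
    in_progress.touch()

    try:
        transfer_func()
    except Exception:
        # Leave .transfer.inprogress behind so datashuttle can detect
        # the interrupted transfer next time and ask the user what to do.
        raise
    else:
        in_progress.unlink()
        complete.touch()
```

On the next run, datashuttle could check for a leftover .transfer.inprogress before transferring and prompt the user.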
> Currently only rawdata is tested. When we extend to derivatives, we will need to extend the tests.

Might be worth thinking about this sooner rather than later in the context of SWC-BIDS and datashuttle.
I've not tested it yet, but it seems rclone will store incomplete transfers in its metadata and re-upload them the next time, according to here and here. However, this doesn't seem very well documented. I've opened a new forum question here asking for a bit more detail and checking whether I've missed any documentation.
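Independent of how rclone resumes, we could verify the outcome after a (possibly interrupted) transfer with rclone's built-in check command, which compares the files in two locations and exits non-zero if any differ or are missing. The paths below are placeholders:

```python
import subprocess

# Compare local and remote copies after a transfer. rclone writes its
# report to stderr and returns a non-zero exit code on any mismatch.
result = subprocess.run(
    ["rclone", "check", "/path/to/local/rawdata", "/path/to/remote/rawdata"],
    capture_output=True,
    text=True,
)
if result.returncode != 0:
    print("rclone found mismatched or missing files:")
    print(result.stderr)
```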
That would be cool, to integrate the SSH tests with local CI runners. Yes, I think adding the derivatives option is a high-priority next step; it actually won't take much to implement, so it could be worth doing before the beta release.