Combine MIT Engaging dandi/001 and dandi/002 storage directories into a single virtual file system/namespace? #8

Open · kabilar opened this issue Jan 30, 2025 · 10 comments

kabilar (Member) commented Jan 30, 2025

Hi @satra, based on the email discussions with the ORCD team from September 2024, each storage server has 1.1 PiB and DANDI requested 1.5 PB, so they had to split the DANDI space across two storage servers, as shown below:

  • 200T /orcd/data/linc/001
  • 620T /orcd/data/dandi/002
  • 460T /orcd/data/satra/002
  • 880T /orcd/data/dandi/001

Michel had proposed a couple of options to create a single virtual file system/namespace for the DANDI storage.

Should we pursue these options or just try to get s3invsync to work with multiple target directories (i.e. /orcd/data/dandi/001 and /orcd/data/dandi/002)? I am inclined toward the latter option, since we have more control over the timeline, but perhaps @jwodder and @yarikoptic have a preference here? Thanks.

@kabilar kabilar changed the title Combine MIT Engaging dandi/001 and dandi/002 storage durectories into a single virtual file system/namespace? Combine MIT Engaging dandi/001 and dandi/002 storage directories into a single virtual file system/namespace? Jan 30, 2025

yarikoptic (Member) commented:

A single virtual filesystem would be much better -- I do not think s3invsync should get into the business of "volume management".

satra (Member) commented Jan 31, 2025

@kabilar - check in with Michel about the virtual layer. I'm not sure that's an easy solution.

I agree that s3invsync shouldn't be in the business of volume management. However, it should be able to take a set of paths and spill over to the next location when it detects an out-of-space condition in the current one, continuing there. Users could specify a path to store the index, but the downloaded objects could be spread across filesystems.
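
For concreteness, a minimal sketch of the configuration this suggests -- a single index location plus an ordered list of data roots to spill across. The struct and field names are hypothetical and purely illustrative; s3invsync provides no such structure today.

```rust
use std::path::PathBuf;

/// Hypothetical configuration for the spill-over behavior described above.
/// (Illustrative only; not part of s3invsync.)
struct SpilloverConfig {
    /// Single, fixed location where the index/metadata is stored.
    index_root: PathBuf,
    /// Ordered data roots, e.g. /orcd/data/dandi/001 then /orcd/data/dandi/002;
    /// downloads move to the next root once the current one runs out of space.
    data_roots: Vec<PathBuf>,
}
```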

kabilar (Member, Author) commented Jan 31, 2025

> @kabilar - check in with Michel about the virtual layer. I'm not sure that's an easy solution.

Just sent an email.

yarikoptic (Member) commented:

> However, it should be able to take a set of paths and spill over to the next location when it detects an out-of-space condition in the current one, continuing there.

It isn't as simple as generating a single file and moving on to the next "part" when the first one is "full". There are no standalone "objects" -- s3invsync replicates the original hierarchy of keys in the bucket and was created with the idea of retaining that hierarchy and adjusting the state "in place".

I guess it would be possible to design logic for inspecting and dealing with multiple leading paths, but it would likely have negative impacts on performance, etc.

kabilar (Member, Author) commented Feb 4, 2025

I can certainly appreciate that this would be additional work and could affect performance. Given that it's going to cost 60k plus 6k recurring after the first year, perhaps it is most cost-effective to implement this feature in s3invsync.

@yarikoptic @jwodder Can we map out what it would take to implement this feature? And then we can make a decision.

jwodder (Member) commented Feb 4, 2025

@kabilar Exactly how do you want this feature to behave? The most obvious option would be to download files to the first filesystem until it's full, then move on to the next file system and so forth, all the while retaining intermediate directory components in paths (e.g., two adjacent keys foo/bar/baz.txt and foo/bar/quux.txt could end up in different filesystems at /orcd/data/linc/001/foo/bar/baz.txt and /orcd/data/dandi/002/foo/bar/quux.txt).

kabilar (Member, Author) commented Feb 5, 2025

> @kabilar Exactly how do you want this feature to behave? The most obvious option would be to download files to the first filesystem until it's full, then move on to the next file system and so forth, all the while retaining intermediate directory components in paths (e.g., two adjacent keys foo/bar/baz.txt and foo/bar/quux.txt could end up in different filesystems at /orcd/data/linc/001/foo/bar/baz.txt and /orcd/data/dandi/002/foo/bar/quux.txt).

Hi @jwodder, this plan makes sense to me.

How and how often would you check for the space available on the filesystem? With parallel jobs, it could get a bit tricky to ensure there is enough space remaining as the filesystem approaches capacity. Additionally, we can make it so that s3invsync is the only tool used to save data to /orcd/data/dandi/001/, but other DANDI users (e.g. Jeremy) may end up concurrently using /orcd/data/dandi/002/ for DANDI-related projects.

jwodder (Member) commented Feb 5, 2025

@kabilar

> How and how often would you check for the space available on the filesystem?

I was thinking of just checking whether any write failures had an ErrorKind of StorageFull.
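
For illustration, a minimal sketch of that check under a simple "try roots in order" strategy. The helper name and structure are assumptions, not s3invsync's actual implementation; ErrorKind::StorageFull corresponds to ENOSPC and is available on stable Rust since 1.83.

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::{Path, PathBuf};

/// Hypothetical helper: try to write `data` for a bucket key under each root
/// in order, spilling over to the next root when the current filesystem
/// reports it is full. (Illustrative only; not s3invsync's code.)
fn write_with_spillover(roots: &[PathBuf], rel_key: &Path, data: &[u8]) -> io::Result<PathBuf> {
    for root in roots {
        let target = root.join(rel_key);
        if let Some(parent) = target.parent() {
            fs::create_dir_all(parent)?;
        }
        match fs::write(&target, data) {
            Ok(()) => return Ok(target),
            // ErrorKind::StorageFull maps to ENOSPC on Linux.
            Err(e) if e.kind() == ErrorKind::StorageFull => {
                // Drop any partial file and try the next root.
                let _ = fs::remove_file(&target);
                continue;
            }
            Err(e) => return Err(e),
        }
    }
    Err(io::Error::new(
        ErrorKind::StorageFull,
        "all configured target filesystems are full",
    ))
}
```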

kabilar (Member, Author) commented Feb 5, 2025

> I was thinking of just checking whether any write failures had an ErrorKind of StorageFull.

Thanks. I think that would be fine for our use case.

yarikoptic (Member) commented:

Some thoughts:

  • our .s3invsync.versions.json files might need to be removed from prior file systems and retained only in the "latest" one to avoid conflicts/ambiguity
  • "move on to the next file system" should mean that even if prior file system(s) later have some space freed up, we would still operate on that "next" file system
  • I feel that we would need an "abstraction" layer for listing/manipulating files and folders, so it would abstract away having multiple leading folders and take care of:
    • listing across multiple locations (what if a path is somehow present in more than one?),
    • deleting from whichever location it is present in,
    • creating only in the latest (while potentially replacing/removing in prior ones)

Altogether it sounds feasible, but I fear that hidden obstacles would come up often, which could have been avoided by relying on proper volume management at the filesystem level -- could some LVM or another layer be put over those storage volumes?
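
To make the third bullet concrete, here is a minimal sketch of what such an abstraction layer could look like, assuming data roots are tried in order and new files are only created under the latest root. The names (MultiRootLayout, locate, delete, create_target) are hypothetical and not part of s3invsync.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical abstraction over several data roots
/// (e.g. /orcd/data/dandi/001 and /orcd/data/dandi/002).
struct MultiRootLayout {
    /// Ordered roots; new files are only created under the last ("latest") one.
    roots: Vec<PathBuf>,
}

impl MultiRootLayout {
    /// Find the full path at which a relative key currently exists, if any.
    fn locate(&self, rel: &Path) -> Option<PathBuf> {
        self.roots.iter().map(|r| r.join(rel)).find(|p| p.exists())
    }

    /// Delete a key from whichever root(s) contain it.
    fn delete(&self, rel: &Path) -> io::Result<()> {
        for root in &self.roots {
            let p = root.join(rel);
            if p.is_file() {
                fs::remove_file(&p)?;
            }
        }
        Ok(())
    }

    /// Path at which a new/updated copy of the key should be created:
    /// always under the latest root (prior copies would be removed first).
    fn create_target(&self, rel: &Path) -> PathBuf {
        self.roots
            .last()
            .expect("at least one root must be configured")
            .join(rel)
    }
}
```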
