Better support for hierarchical stores and nested URLs #36

Closed
manzt opened this issue Feb 17, 2020 · 3 comments · Fixed by #42

Comments


manzt commented Feb 17, 2020

Something weird is going on when trying to interact with stores specified in nested directories. I think this has something to do with how string concatenation is done in zarr.js.

import { openArray } from 'zarr';
const config = {
    store: 'http://localhost:8000',
    path: 'dummy_data.zarr', // data served directly from data/
};
const z = await openArray(config);
// No errors

import { openArray } from 'zarr';
const config = {
    store: 'http://localhost:8000/data',
    path: 'dummy_data.zarr',
};
const z = await openArray(config);
// Error: array not found at path dummy_data.zarr

I see the requests are being made to 'http://localhost:8000/dummy_data.zarr', dropping the data/.

Second, do we have a general open method like the one in zarr for Python? Something like:

// group or array located at '/'
const zarrStore = z.open('http://localhost:8000/my_data.zarr'); 
const group = zarrStore.some_group;
console.log(group.info);

which allows for traversal of the store? I'd like to do something like this:

# file structure
└── dummy_data.zarr
   └── my_group
      ├── arr1
      ├── arr2
      └── arr3

import zarr

z = zarr.open('dummy_data.zarr')
# <zarr.hierarchy.Group '/'>
for key in z.my_group.array_keys():
    print(key)
# arr1
# arr2
# arr3
for key, arr in z.my_group.arrays():
    print(arr)
# <zarr.core.Array '/my_group/arr1' (4, 36040, 52660) uint16>
# <zarr.core.Array '/my_group/arr2' (4, 18020, 26330) uint16>
# <zarr.core.Array '/my_group/arr3' (4, 9010, 13165) uint16>

manzt commented Feb 17, 2020

Ideally, "store" would point at the root (/) of dummy_data.zarr, and all paths would be relative to that root. Opening a nested array would then look like this:

const config = {
    store: 'http://localhost:8000/dummy_data.zarr',
    path: '/my_group/arr1'
}
const z = await openArray(config);

Edit: Upon further research, children are not listed in .zgroup, so traversing a zarr store would be very difficult: the contents of a group cannot be listed.
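
A rough sketch of what this implies, under stated assumptions: without a listing, a client can still find out what lives at a path it already knows by probing for the Zarr v2 metadata keys (.zarray for arrays, .zgroup for groups). The nodeKind helper and the Map-backed store below are hypothetical stand-ins, not zarr.js APIs, and a real HTTP store would make the same checks with asynchronous requests.

```javascript
// Hypothetical sketch: a Map stands in for a key/value store that can
// answer "does this key exist?" but cannot list keys under a prefix.
const store = new Map([
  ["my_group/.zgroup", '{"zarr_format": 2}'],
  ["my_group/arr1/.zarray", "{}"], // metadata body elided for brevity
]);

// Returns "array", "group", or null for a path we already know about.
function nodeKind(store, path) {
  const prefix = path.replace(/^\/+|\/+$/g, ""); // trim leading/trailing "/"
  const key = (name) => (prefix ? `${prefix}/${name}` : name);
  if (store.has(key(".zarray"))) return "array";
  if (store.has(key(".zgroup"))) return "group";
  return null;
}

console.log(nodeKind(store, "/my_group"));     // group
console.log(nodeKind(store, "my_group/arr1")); // array
console.log(nodeKind(store, "my_group/arr2")); // null
```

Probing works for known names such as my_group/arr1, but discovering arr2 and arr3 without being told about them would need a listing that a plain HTTP server may not provide.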


manzt commented Feb 17, 2020

Even further research into zarr shows this chaining is achieved using the .get method.

import zarr
z = zarr.open('my_data.zarr')
arr = z.get('my_group').get('00')  # z.get('my_group/00') works as well

Sadly this might be a bit ugly in javascript:

import { open } from 'zarr'; // proposed API, does not exist yet
const z = open('http://localhost:8000/my_data.zarr');
const arr = await (await z.get('my_group')).get('arr1');
// but hopefully users would do something like the following anyway
const sameArr = await z.get('my_group/arr1');
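
To make the equivalence of the two call styles concrete, here is a minimal sketch. NodeSketch is a hypothetical class, not the zarr.js API, and it only tracks paths; the assumption is that get is async because a real store would have to fetch metadata before returning a group or array.

```javascript
// Hypothetical sketch, not the zarr.js API: a node that only tracks its path.
class NodeSketch {
  constructor(root, path = "") {
    this.root = root; // store URL, e.g. the .zarr root
    this.path = path; // path relative to the root, no leading "/"
  }

  // Join a child path onto a parent path with single "/" separators.
  static join(parent, child) {
    return [parent, child]
      .flatMap((p) => p.split("/"))
      .filter((segment) => segment.length > 0)
      .join("/");
  }

  // async to mirror a store that must fetch metadata before returning.
  async get(childPath) {
    return new NodeSketch(this.root, NodeSketch.join(this.path, childPath));
  }
}

async function demo() {
  const z = new NodeSketch("http://localhost:8000/my_data.zarr");
  const chained = await (await z.get("my_group")).get("arr1");
  const direct = await z.get("my_group/arr1");
  console.log(chained.path === direct.path); // true
}
demo();
```

Because both forms resolve to the same relative path, the single-call form with a slash-separated path can be the recommended one, with chaining kept for cases where the intermediate group is needed.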


gzuidhof commented Feb 18, 2020

Yes, I completely agree this isn't how it should be! There is a short note describing the issue here too.

I think this is the line that causes the problem: https://github.com/gzuidhof/zarr.js/blob/master/src/storage/httpStore.ts#L21. URL is just not a very good fit for this; we should use a simpler path concatenation method.

>>> new URL("my-item", "http://example.com/arr.zarr").href;
"http://example.com/my-item"
// Fail, it just completely removes arr.zarr
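
A simpler concatenation could look like the sketch below. joinUrlParts is a hypothetical helper, not something that exists in zarr.js; it assumes the store URL has no query string or fragment and just joins the pieces with exactly one "/" between them.

```javascript
// Hypothetical helper, not part of zarr.js: joins a store root and one or
// more keys with exactly one "/" between them.
function joinUrlParts(...parts) {
  return parts
    .map((part, i) => {
      let p = String(part);
      if (i > 0) p = p.replace(/^\/+/, ""); // drop leading slashes
      if (i < parts.length - 1) p = p.replace(/\/+$/, ""); // drop trailing slashes
      return p;
    })
    .filter((p) => p.length > 0)
    .join("/");
}

console.log(joinUrlParts("http://example.com/arr.zarr", "my-item"));
// http://example.com/arr.zarr/my-item

console.log(joinUrlParts("http://localhost:8000/data/", "/dummy_data.zarr", ".zarray"));
// http://localhost:8000/data/dummy_data.zarr/.zarray
```

For comparison, new URL("my-item", "http://example.com/arr.zarr/") keeps arr.zarr because the base ends in a slash, so appending a trailing slash to the store URL is another possible fix.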
