The problem
I'm following the snakebids tutorial and issue #209. The current way of passing arguments to bids() and expand() seems to be to use the entities/wildcards found in the current BidsDataset as a base dict and manually add a few components if needed. For example, from the tutorial:
rule all:
    input:
        inputs['bold'].expand(
            bids(
                root='results',
                fwhm='{fwhm}',
                suffix='bold.nii.gz',
                **inputs['bold'].wildcards
            ),
            fwhm=config['fwhm'],
        )
There are two possible issues:
- expand() depends only on the zip_lists attribute of a dataset, but it is a method of inputs['bold'], which may carry other attributes that expand() does not need (e.g. the dataset root path). This suggests separating expand() out of the BidsComponentRow/BidsPartialComponent classes into an independent function bids_expand(). With this in mind, we can change the above rule all to the following:
rule all:
    input:
        bids_expand(
            bids(
                root='results',
                fwhm='{fwhm}',
                suffix='bold.nii.gz',
                **inputs['bold'].wildcards
            ),
            fwhm=config['fwhm'],
            **inputs['bold'].zip_lists
        )
- Interfaces like BidsDataset.drop() or BidsDataset.with_entities() may be equivalent to doing some more intuitive and simpler dictionary operations over entities or wildcards.
Assume we define a subclass MyDict of Python dict that supports the following operations:
- Update or add entries (overwriting the contents of d1 with the contents of d2). This is already supported by the built-in dict union operator in Python 3.9+:
    d1 | d2
- Remove entries. We can define a function that returns a copy of the dict without the given keys, or define a subclass of dict that supports this by overriding the subtraction operator '-':
    d - k # remove key k from dict d
    d - [k1, k2…, kn] # subtract an iterable of keys, i.e. remove n keys at once
- Select entries. We can define another function or operator that returns a new dict containing only a selected subset of keys:
    d & k
    d & [k1, k2…, kn]
- One thing to note is that all entity dictionaries have strings as keys. To avoid typing quotes when initializing a dict[str, Any], we can use dict() instead of {}, i.e.
    d = {'subject': '{subject}'} becomes d = dict(subject='{subject}')
To make code analysis easier, operations 2-3 above should be implemented as pure functions (no side effects, with a result that depends solely on the input parameters) that return a new dictionary instead of modifying the dictionary they are called on.
This would allow reuse of variables. For example, if A and B are two BIDS datasets at paths 'path/to/a' and 'path/to/b', we can define:
zipdict = generated_inputs['some_name'].zipdict
"""
zipdict may be something like
{
    'subject': ['s1', 's2'],
    'sample': ['sample1', 'sample2']
}
"""
dictA = zipdict | {'root': 'path/to/a'}
# assume we are averaging over samples in our analysis, we don't need them in the output path, so we can subtract away the 'sample' key
dictB = (zipdict | {'root': 'path/to/b'}) - 'sample'
rule R1:
    input:
        path = bids_expandpath(**dictA)
    …
    output:
        path = bids_expandpath(**dictB)
Similarly, if we want to form wildcard paths for both input and output, we may write:
entitydict = {
    'datatype': 'micr',
    'stain': '{stain}'
} # missing some keys, but used as an example
dictA = entitydict | {'suffix': 'source.nii'}
dictB = entitydict | {'suffix': 'target.nii'}
rule R2:
    input:
        path = bids(**dictA)
    …
    output:
        path = bids(**dictB)
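As a rough illustration of the dictionary operations listed above, here is a minimal sketch of such a subclass. MyDict is only the placeholder name used in this issue, and nothing here is existing snakebids API. The operator semantics are assumptions, in particular treating a lone string as a single key and silently ignoring keys that are not present.

```python
from collections.abc import Iterable


class MyDict(dict):
    """Hypothetical dict subclass; every operator returns a new MyDict."""

    @staticmethod
    def _as_keys(keys):
        # Assumption: a lone string is one key, not an iterable of characters.
        if isinstance(keys, str) or not isinstance(keys, Iterable):
            return {keys}
        return set(keys)

    def __or__(self, other):
        # dict.__or__ returns a plain dict, so re-wrap the merged result.
        merged = super().__or__(other)
        return merged if merged is NotImplemented else MyDict(merged)

    def __sub__(self, keys):
        # Copy without the given key(s); self is left untouched.
        drop = self._as_keys(keys)
        return MyDict({k: v for k, v in self.items() if k not in drop})

    def __and__(self, keys):
        # Copy restricted to the given key(s); self is left untouched.
        keep = self._as_keys(keys)
        return MyDict({k: v for k, v in self.items() if k in keep})


# Mirrors the dictA/dictB example above, with made-up values.
zipdict = MyDict(subject=['s1', 's2'], sample=['sample1', 'sample2'])
dictA = zipdict | {'root': 'path/to/a'}
dictB = (zipdict | {'root': 'path/to/b'}) - 'sample'
subject_only = zipdict & ['subject']
```

Because each operator builds a new dictionary, zipdict can be reused for both dictA and dictB without one definition affecting the other.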
Proposed Enhancement
I would like to propose a different way to pass parameters to the bids() and expand() functions, as described above.
To do so, we need to move expand() out of the BidsComponentRow/BidsPartialComponent classes and add a dict subclass that supports merging and overwriting parameters, removing sets of keys, and selecting subsets of keys.
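To illustrate that a standalone expand needs nothing beyond the zip lists, here is a small sketch. bids_expand is the hypothetical name proposed in this issue, not an existing snakebids function. The signature is my assumption: the zip lists are passed as one explicit dict so they can be told apart from the extra keyword arguments, which are combined as a product. The template is plain str.format text standing in for the output of bids().

```python
from itertools import product


def bids_expand(template, zip_lists, **extra):
    """Hypothetical sketch: zip over zip_lists, then product over extra."""
    keys = list(zip_lists)
    # One tuple per zipped row, e.g. ('s1', 'sample1'), ('s2', 'sample2').
    rows = list(zip(*(zip_lists[k] for k in keys)))
    extra_keys = list(extra)
    # Scalars are wrapped so every extra argument is a list of choices.
    extra_vals = [v if isinstance(v, (list, tuple)) else [v] for v in extra.values()]
    paths = []
    for row in rows:
        for combo in product(*extra_vals):
            values = {**dict(zip(keys, row)), **dict(zip(extra_keys, combo))}
            paths.append(template.format(**values))
    return paths


# Made-up template and values, for illustration only.
paths = bids_expand(
    'results/sub-{subject}_sample-{sample}_fwhm-{fwhm}_bold.nii.gz',
    {'subject': ['s1', 's2'], 'sample': ['sample1', 'sample2']},
    fwhm=[5, 10],
)
```

Here the subject and sample values stay paired row by row, while each row is repeated once per fwhm value.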
Environment
- The version of snakebids is the main branch as of today