
Using dictionary operations for bids() and expand() #437

@Karl5766

The problem

I'm following the snakebids tutorial and issue #209. The current way of passing arguments to bids() and expand() seems to be to use the entities/wildcards found in the current BidsDataset as a base dict and manually add a few components if needed. For example, from the tutorial:

rule all:
    input:
        inputs['bold'].expand(
            bids(
                root='results',
                fwhm='{fwhm}',
                suffix='bold.nii.gz',
                **inputs['bold'].wildcards
            ),
            fwhm=config['fwhm'],
        )

There are two possible issues:

  1. expand() depends only on the zip_lists attribute of a dataset, but inputs['bold'] may carry other attributes that are not needed (e.g. the dataset root path). This suggests separating expand() out of the BidsComponentRow/BidsPartialComponent classes into an independent function bids_expand(). With this in mind, we can change the above rule all to the following:
rule all:
    input:
        bids_expand(
            bids(
                root='results',
                fwhm='{fwhm}',
                suffix='bold.nii.gz',
                **inputs['bold'].wildcards
            ),
            fwhm=config['fwhm'],
            **inputs['bold'].zip_lists
        )
  2. Interfaces like BidsDataset.drop() or BidsDataset.with_entities() may be equivalent to simpler, more intuitive dictionary operations over entities or wildcards.
    Assume we define a subclass MyDict of Python's dict that supports the following operations:

    1. Update or add entries (overwriting the contents of d1 with the contents of d2). Python dicts already support this through the | operator (Python 3.9+):
      d1 | d2
    2. Return a copy of the dict with some entries removed, either through a function or by overriding the subtraction operator '-' in a dict subclass:
      d - k  # remove key k from dict d
      d - [k1, k2, …, kn]  # subtract an iterable of keys, i.e. remove n keys at once
    3. Return a new dict containing only a selected subset of keys, through another function or operator:
      d & k
      d & [k1, k2, …, kn]
    4. Note that all entity dictionaries have strings as keys. To avoid typing quotes when initializing a dict[str, Any], we can use dict() instead of {}, i.e.
      d = {'subject': '{subject}'} becomes d = dict(subject='{subject}')

To make code analysis easier, operations 2 and 3 above should be implemented as pure functions (no side effects, with results depending solely on their input parameters) rather than mutating the dictionary they are called on.
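
To make the operations above concrete, here is a minimal sketch of what such a subclass might look like. MyDict and the helper _as_keys are hypothetical names used only for illustration and are not part of snakebids. The sketch assumes that a lone string counts as a single key and that any other iterable is treated as a collection of keys.

```python
from collections.abc import Iterable, Mapping


def _as_keys(keys):
    """Hypothetical helper: a lone string is one key, other iterables are many."""
    if isinstance(keys, str) or not isinstance(keys, Iterable):
        return {keys}
    return set(keys)


class MyDict(dict):
    """Hypothetical dict subclass; every operator returns a new MyDict."""

    def __or__(self, other):
        # Merge into a fresh MyDict; values from `other` win on key clashes.
        if not isinstance(other, Mapping):
            return NotImplemented
        return MyDict({**self, **other})

    def __sub__(self, keys):
        # Copy without the given key(s); `self` is left untouched.
        drop = _as_keys(keys)
        return MyDict({k: v for k, v in self.items() if k not in drop})

    def __and__(self, keys):
        # Copy containing only the given key(s); `self` is left untouched.
        keep = _as_keys(keys)
        return MyDict({k: v for k, v in self.items() if k in keep})


zipdict = MyDict(subject=["s1", "s2"], sample=["sample1", "sample2"])
dictB = (zipdict | {"root": "path/to/b"}) - "sample"
assert dictB == {"subject": ["s1", "s2"], "root": "path/to/b"}
assert "sample" in zipdict  # the original dict was not modified
```

Because each operator builds a new object, the same base dict can be reused for several rules without one rule's changes leaking into another.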

This would allow reuse of variables. For example, if A and B are two BIDS datasets at paths 'path/to/a' and 'path/to/b', we can define

	zipdict = generated_inputs['some_name'].zipdict
	"""
	zipdict may be something like 
	{
	    'subject': ['s1', 's2'],
	    'sample': ['sample1', 'sample2']
	}
	"""
	
	dictA = zipdict | {'root': 'path/to/a'}
	# assume we are averaging over samples in our analysis, we don't need them in the output path, so we can subtract away the 'sample' key
	dictB = (zipdict | {'root': 'path/to/b'}) - 'sample'
	
	rule R1:
	    input:
	        path = bids_expandpath(**dictA)
	    …
	    output:
	        path = bids_expandpath(**dictB)

Similarly, if we want to form wildcard paths for both input and output, we may write

	entitydict = {
	    'datatype': 'micr',
	    'stain': '{stain}'
	}  # missing some keys, but used as an example
	dictA = entitydict | {'suffix': 'source.nii'}
	dictB = entitydict | {'suffix': 'target.nii'}
	
	rule R2:
	    input:
	        path = bids(**dictA)
	    …
	    output:
	        path = bids(**dictB)

Proposed Enhancement

I would like to propose a different way to pass parameters to bids() and expand() functions, as described above.

To do so, we need to move expand() out of the BidsComponentRow/BidsPartialComponent classes and add a dict subclass that supports merging parameters, overwriting values, removing keys, and selecting a subset of keys.
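
As a rough illustration of the first part, a standalone expand that depends only on zip lists might look like the sketch below. This is not the snakebids implementation. The function name bids_expand comes from the proposal, but the signature is an assumption: zip lists are passed as one explicit argument so they can be told apart from the extra wildcards, and Python's str.format stands in for Snakemake's wildcard substitution.

```python
from itertools import product


def bids_expand(template, zip_lists, **extra):
    """Hypothetical standalone expand that needs nothing but zip lists.

    Entries of `zip_lists` are zipped together (position i of every list
    forms one row); each `extra` wildcard is then crossed with every row.
    """
    keys = list(zip_lists)
    # With no zip lists, fall back to a single empty row.
    rows = [dict(zip(keys, vals)) for vals in zip(*zip_lists.values())] if keys else [{}]
    # Normalise scalars to one-element lists so product() can iterate them.
    extra_lists = {
        k: list(v) if isinstance(v, (list, tuple)) else [v] for k, v in extra.items()
    }
    paths = []
    for row in rows:
        for combo in product(*extra_lists.values()):
            paths.append(template.format(**row, **dict(zip(extra_lists, combo))))
    return paths


paths = bids_expand(
    "results/sub-{subject}/sub-{subject}_fwhm-{fwhm}_bold.nii.gz",
    {"subject": ["s1", "s2"]},
    fwhm=[5, 10],
)
assert paths[0] == "results/sub-s1/sub-s1_fwhm-5_bold.nii.gz"
assert len(paths) == 4  # 2 subjects x 2 fwhm values
```

Nothing in this function needs the dataset root or any other component attribute, which is the point of pulling it out of the class.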

Environment

  • The version of snakebids is the main branch as of today
