JackieZhai

Because the "mala" package is deprecated, I changed some of the packages to keep things compatible.
The U-Net structure and the input/intermediate/output shapes also did not match.
(I changed the shapes here)

P.S. I found that "volumes/labels/mask" for these datasets is not uploaded to the "s3" bucket.
Could you please tell me whether it is needed for the training procedure in "train.py"?

@funkey
Member

funkey commented Nov 4, 2021

Thanks a lot @JackieZhai! @sheridana, can you please review and help with the label masks?

@sheridana
Collaborator

@JackieZhai the training masks should be created from the label data prior to training. The masks specify where random training locations can be selected from, so they should be 1 inside the labeled training data and 0 outside. For fib25 that is easy, since the raw is not padded around the labels (so you can use np.ones_like). For zebrafinch, the raw is padded around the labels, but the labels array has its own shape that differs from the raw, so you can still use np.ones_like on the labels. For hemi, the labels are padded with a background label, so you would want to mask that out. For example (assuming you downloaded the first training zarr from each dataset):

import numpy as np
import zarr

samples = [
        'funke/fib25/training/trvol-250-1.zarr',
        'funke/hemi/training/eb-inner-groundtruth-with-context-x20172-y2322-z14332.zarr',
        'funke/zebrafinch/training/gt_z255-383_y1407-1663_x1535-1791.zarr'
        ]

labels_name = 'volumes/labels/neuron_ids'
labels_mask_name = 'volumes/labels/labels_mask'

for sample in samples:

    f = zarr.open(sample, 'a')

    labels = f[labels_name][:]
    offset = f[labels_name].attrs['offset']
    resolution = f[labels_name].attrs['resolution']

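    # Start from a mask of ones covering the whole labels array.
    labels_mask = np.ones_like(labels, dtype=np.uint8)

    if 'hemi' in sample:
        # The hemi background label is -3 wrapped to uint64 (2**64 - 3).
        # Written out explicitly because newer NumPy rejects np.uint64(-3).
        background_mask = labels == np.uint64(2**64 - 3)
        labels_mask[background_mask] = 0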

    f[labels_mask_name] = labels_mask
    f[labels_mask_name].attrs['offset'] = offset
    f[labels_mask_name].attrs['resolution'] = resolution
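
To try the masking logic without downloading the zarrs, here is a minimal NumPy-only sketch on a toy array. The helper names `make_labels_mask` and `sample_masked_location` are hypothetical and not part of the repository, and the sampling function is only a simplified illustration of "pick a random location where the mask is 1", not how the training pipeline actually samples.

```python
import numpy as np

# Assumed hemi background value: what -3 wraps to as an unsigned 64-bit int.
HEMI_BACKGROUND = np.uint64(2**64 - 3)


def make_labels_mask(labels, background_label=None):
    """Hypothetical helper: uint8 mask, 1 on labeled voxels, 0 on background."""
    mask = np.ones_like(labels, dtype=np.uint8)
    if background_label is not None:
        mask[labels == background_label] = 0
    return mask


def sample_masked_location(mask, rng):
    """Hypothetical helper: pick one voxel index uniformly where mask == 1."""
    candidates = np.argwhere(mask == 1)
    return tuple(candidates[rng.integers(len(candidates))])


# Toy 2x2x2 label volume with one voxel set to the background label.
labels = np.arange(1, 9, dtype=np.uint64).reshape(2, 2, 2)
labels[0, 0, 0] = HEMI_BACKGROUND

mask = make_labels_mask(labels, HEMI_BACKGROUND)
location = sample_masked_location(mask, np.random.default_rng(0))
```

For fib25 and zebrafinch style data, calling `make_labels_mask(labels)` without a background label gives the all-ones mask described above.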

@JackieZhai
Author

Super comprehensive explanation of masking these 3 datasets, and it exactly answers my questions about "hemi"! Thanks @funkey @sheridana for your help. 👍
