Color by attribute (#148)
* initial commit with dropdown window and coloring based on position

* lint fixes

* interpolate between two opposite colors + added extra test features

* replaced dropdown from mui to czi-sds/components

* changed Button and InputText czi-sds/components elements, because of new version

* categorical/continuous colormaps

* fixed czi-sds/components after new version

* attributes are saved in zarr and loaded when requested in the application

* fixed width of dropdownMenu

* fixed mistakes in convert after merge with main

* replaced HTTPStore by .attrs.asObject

* updated conversion test

* reset dropdown after loading new dataset + removed hardcoded number of default attributes when fetching

* unified colormaps, toggle for colorBy, second colormap legend

* added colorBy colorbar overlay + added flag for pre-normalized attributes in conversion script

* changed default dataset(s) + color legend only visible when relevant

* changed text on colormap legend overlays

* demo ready PR for SI videos (removed some default attributes)

* added conversion tests + conversion detects categorical/continuous

* trying to fix lint issues

* fixed wrong dataset names

* toggle colorBy to off when new dataset is loaded

* colorBy toggle only visible when Zarr Store has attributes

* saved colorBy toggle and field in ViewerState, so it is preserved when copying URL

* config option to show/not show the default dropdown options like x-pos, y-pos, quadrants, etc.

* config option to disable coloring cells (even if attributes are provided)

* changed example default dataset (includes attributes)

* Andy sweet/refactor attr options (#157)

* Rough attempt at refactoring attr options

* Use regular dropdown

* Add some comments

* resolved Zarr3 issue, and npm complaint in leftSidebar/TrackControls

---------

Co-authored-by: Andy Sweet <[email protected]>

* implemented review Andy

* npm fix

* updated readme for adding attributes to Zarr with cli

* wip updating example notebook

* updated example notebook with coloring by attribute

* lint fixes

* traditional zebrafish dataset as default

---------

Co-authored-by: Andy Sweet <[email protected]>
TeunHuijben and andy-sweet authored Jan 23, 2025
1 parent d9fbf2b commit d47f9d7
Showing 19 changed files with 978 additions and 193 deletions.
15 changes: 14 additions & 1 deletion CONFIG.ts
@@ -19,8 +19,17 @@ const config = {
// Maximum number of cells a user can select without getting a warning
max_num_selected_cells: 100,

// Choose colormap for the tracks, options: viridis-inferno, magma-inferno, inferno-inferno, plasma-inferno, cividis-inferno [default]
// Choose colormap for the tracks
// options: viridis-inferno, magma-inferno, inferno-inferno, plasma-inferno, cividis-inferno [default]
colormap_tracks: "cividis-inferno",

// Choose colormap for coloring the cells, when the attribute is continuous or categorical
// options: HSL, viridis, plasma, inferno, magma, cividis
colormap_colorby_categorical: "HSL",
colormap_colorby_continuous: "plasma",

// Show default attributes in the left dropdown menu for coloring the cells
showDefaultAttributes: true,

// Point size (arbitrary units), if cell sizes not provided in zarr attributes
point_size: 0.1,
@@ -33,7 +42,11 @@ const config = {

// Point color (when selector hovers over)
preview_hightlight_point_color: [0.8, 0.8, 0], //yellow
},

permission:{
// Allow users to color cells by attributes
allowColorByAttribute: true
}
}

12 changes: 12 additions & 0 deletions README.md
@@ -119,6 +119,18 @@ intracktive convert --csv_file path/to/tracks.csv --add_radius

Or use `intracktive convert --help` for the documentation on the inputs and outputs

Additionally, inTRACKtive can give each cell a different color based on provided data attributes (see the example [Jupyter Notebook (`/python/src/intracktive/examples`)](/python/src/intracktive/examples/notebook1_inTRACKtive_from_notebook.ipynb)). Any column present in the `tracks.csv` tracking data can be added to the Zarr file as an attribute. Using the command-line interface, you can add one, several, or all columns as attributes:
```
# add a specific column as an attribute
intracktive convert --csv_file path/to/tracks.csv --add_attribute cell_size
# add multiple columns as attributes
intracktive convert --csv_file path/to/tracks.csv --add_attribute cell_size,time,diameter,color
# add all columns as attributes
intracktive convert --csv_file path/to/tracks.csv --add_all_attributes
```
When using `--add_all_attributes`, all columns are added as attributes, apart from the default columns (`track_id`, `t`, `z`, `y`, `x`, and `parent_track_id`). If desired, these default columns can still be added manually, for example with `--add_attribute x`.
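For reference, the column selection performed by `--add_all_attributes` can be sketched in a few lines of pandas. This is a sketch, not the package's implementation; `REQUIRED_COLUMNS` mirrors the default columns listed above, and the sample data is illustrative:

```python
import pandas as pd

# Default columns that are never treated as attributes (names from the README above)
REQUIRED_COLUMNS = ["track_id", "t", "z", "y", "x", "parent_track_id"]

tracks = pd.DataFrame(
    {
        "track_id": [1, 1],
        "t": [0, 1],
        "z": [0.0, 0.0],
        "y": [1.0, 2.0],
        "x": [3.0, 4.0],
        "parent_track_id": [-1, -1],
        "cell_size": [10.0, 12.0],
        "label": [1, 1],
    }
)

# Every non-default column becomes an attribute; Index.difference returns them sorted
extra_cols = tracks.columns.difference(REQUIRED_COLUMNS).to_list()
print(extra_cols)  # → ['cell_size', 'label']
```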

In order for the viewer to access the data, the data must be hosted at a location the browser can access. For testing and visualizing data on your own computer, the easiest way is to host the data via `localhost`. This repository contains a [tool](python/src/intracktive/server.py) to host the data locally:

81 changes: 80 additions & 1 deletion python/src/intracktive/_tests/test_cli.py
@@ -12,7 +12,7 @@ def _run_command(command_and_args: List[str]) -> None:
assert exit.code == 0, f"{command_and_args} failed with exit code {exit.code}"


def test_convert_cli(
def test_convert_cli_simple(
tmp_path: Path,
make_sample_data: pd.DataFrame,
) -> None:
@@ -28,3 +28,82 @@ def test_convert_cli(
str(tmp_path),
]
)


def test_convert_cli_single_attribute(
tmp_path: Path,
make_sample_data: pd.DataFrame,
) -> None:
df = make_sample_data
df.to_csv(tmp_path / "sample_data.csv", index=False)

_run_command(
[
"convert",
"--csv_file",
str(tmp_path / "sample_data.csv"),
"--out_dir",
str(tmp_path),
"--add_attribute",
"z",
]
)


def test_convert_cli_multiple_attributes(
tmp_path: Path,
make_sample_data: pd.DataFrame,
) -> None:
df = make_sample_data
df.to_csv(tmp_path / "sample_data.csv", index=False)

_run_command(
[
"convert",
"--csv_file",
str(tmp_path / "sample_data.csv"),
"--out_dir",
str(tmp_path),
"--add_attribute",
"z,x,z",
]
)


def test_convert_cli_all_attributes(
tmp_path: Path,
make_sample_data: pd.DataFrame,
) -> None:
df = make_sample_data
df.to_csv(tmp_path / "sample_data.csv", index=False)

_run_command(
[
"convert",
"--csv_file",
str(tmp_path / "sample_data.csv"),
"--out_dir",
str(tmp_path),
"--add_all_attributes",
]
)


def test_convert_cli_all_attributes_prenormalized(
tmp_path: Path,
make_sample_data: pd.DataFrame,
) -> None:
df = make_sample_data
df.to_csv(tmp_path / "sample_data.csv", index=False)

_run_command(
[
"convert",
"--csv_file",
str(tmp_path / "sample_data.csv"),
"--out_dir",
str(tmp_path),
"--add_all_attributes",
"--pre_normalized",
]
)
3 changes: 2 additions & 1 deletion python/src/intracktive/_tests/test_convert.py
@@ -43,7 +43,8 @@ def test_actual_zarr_content(tmp_path: Path, make_sample_data: pd.DataFrame) ->
convert_dataframe_to_zarr(
df=df,
zarr_path=new_path,
extra_cols=["radius"],
add_radius=True,
extra_cols=(),
)

new_data = zarr.open(new_path)
127 changes: 110 additions & 17 deletions python/src/intracktive/convert.py
@@ -85,7 +85,9 @@ def get_unique_zarr_path(zarr_path: Path) -> Path:
def convert_dataframe_to_zarr(
df: pd.DataFrame,
zarr_path: Path,
add_radius: bool = False,
extra_cols: Iterable[str] = (),
pre_normalized: bool = False,
) -> Path:
"""
Convert a DataFrame of tracks to a sparse Zarr store
@@ -113,11 +115,18 @@
flag_2D = True
df["z"] = 0.0

points_cols = (
["z", "y", "x", "radius"] if add_radius else ["z", "y", "x"]
) # columns to store in the points array
extra_cols = list(extra_cols)
columns = REQUIRED_COLUMNS + extra_cols
points_cols = ["z", "y", "x"] + extra_cols # columns to store in the points array

for col in columns:
columns_to_check = (
REQUIRED_COLUMNS + ["radius"] if add_radius else REQUIRED_COLUMNS
) # columns to check for in the DataFrame
columns_to_check = columns_to_check + extra_cols
print("point_cols:", points_cols)
print("columns_to_check:", columns_to_check)

for col in columns_to_check:
if col not in df.columns:
raise ValueError(f"Column '{col}' not found in the DataFrame")

@@ -144,7 +153,7 @@

n_tracklets = df["track_id"].nunique()
# (z, y, x) + extra_cols
num_values_per_point = 3 + len(extra_cols)
num_values_per_point = 4 if add_radius else 3

# store the points in an array
points_array = (
@@ -154,6 +163,15 @@
)
* INF_SPACE
)
attribute_array_empty = (
np.ones(
(n_time_points, max_values_per_time_point),
dtype=np.float32,
)
* INF_SPACE
)
attribute_arrays = {}
attribute_types = [None] * len(extra_cols)

points_to_tracks = lil_matrix(
(n_time_points * max_values_per_time_point, n_tracklets), dtype=np.int32
@@ -165,10 +183,25 @@
points_array[t, : group_size * num_values_per_point] = (
group[points_cols].to_numpy().ravel()
)

points_ids = t * max_values_per_time_point + np.arange(group_size)

points_to_tracks[points_ids, group["track_id"] - 1] = 1

for index, col in enumerate(extra_cols):
attribute_array = attribute_array_empty.copy()
for t, group in df.groupby("t"):
group_size = len(group)
attribute_array[t, :group_size] = group[col].to_numpy().ravel()
# check if attribute is categorical or continuous
if (
len(np.unique(attribute_array[attribute_array != INF_SPACE])) <= 10
): # get number of unique values, excluding INF_SPACE
attribute_types[index] = "categorical"
else:
attribute_types[index] = "continuous"
attribute_arrays[col] = attribute_array

LOG.info(f"Munged {len(df)} points in {time.monotonic() - start} seconds")
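The categorical/continuous detection in the loop above reduces to counting unique values while ignoring the padding sentinel. A minimal standalone sketch (the `INF_SPACE` value here is illustrative; the threshold of 10 unique values mirrors the diff):

```python
import numpy as np

INF_SPACE = -9999.0  # padding sentinel (illustrative value; the real constant lives in convert.py)

def classify_attribute(attribute_array: np.ndarray, max_categories: int = 10) -> str:
    # Drop padded entries, then count distinct values
    values = attribute_array[attribute_array != INF_SPACE]
    return "categorical" if len(np.unique(values)) <= max_categories else "continuous"

labels = np.array([1.0, 2.0, 1.0, INF_SPACE], dtype=np.float32)  # 2 unique values
sizes = np.linspace(0.0, 1.0, num=50, dtype=np.float32)          # 50 unique values

print(classify_attribute(labels))  # → categorical
print(classify_attribute(sizes))   # → continuous
```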

# creating mapping of tracklets parent-child relationship
@@ -233,16 +266,32 @@
chunks=(1, points_array.shape[1]),
dtype=np.float32,
)
print("points shape:", points.shape)
points.attrs["values_per_point"] = num_values_per_point

if len(extra_cols) > 0:
attributes_matrix = np.hstack(
[attribute_arrays[attr] for attr in attribute_arrays]
)
attributes = top_level_group.create_dataset(
"attributes",
data=attributes_matrix,
chunks=(1, attribute_array.shape[1]),
dtype=np.float32,
)
attributes.attrs["attribute_names"] = extra_cols
attributes.attrs["attribute_types"] = attribute_types
attributes.attrs["pre_normalized"] = pre_normalized

mean = df[["z", "y", "x"]].mean()
extent = (df[["z", "y", "x"]] - mean).abs().max()
extent_xyz = extent.max()

for col in ("z", "y", "x"):
points.attrs[f"mean_{col}"] = mean[col]

points.attrs["extent_xyz"] = extent_xyz
points.attrs["fields"] = ["z", "y", "x"] + extra_cols
points.attrs["fields"] = points_cols
points.attrs["ndim"] = 2 if flag_2D else 3

top_level_group.create_groups(
@@ -287,7 +336,11 @@ def convert_dataframe_to_zarr(
LOG.info(f"Saved to Zarr in {time.monotonic() - start} seconds")


def dataframe_to_browser(df: pd.DataFrame, zarr_dir: Path) -> None:
def dataframe_to_browser(
df: pd.DataFrame,
zarr_dir: Path,
extra_cols: Iterable[str] = (),
) -> None:
"""
Open a Tracks DataFrame in inTRACKtive in the browser. In detail: this function
1) converts the DataFrame to Zarr, 2) saves the Zarr in the specified path (if provided, otherwise a temporary path),
@@ -299,16 +352,18 @@ def dataframe_to_browser(df: pd.DataFrame, zarr_dir: Path) -> None:
The DataFrame containing the tracks data. The required columns in the dataFrame are: ['track_id', 't', 'z', 'y', 'x', 'parent_track_id']
zarr_dir : Path
The directory to save the Zarr bundle, only the path to the folder is required (excluding the zarr_bundle.zarr filename)
extra_cols : Iterable[str], optional
Extra columns to include in the Zarr store as attributes, by default an empty tuple
"""

if str(zarr_dir) in (".", None):
with tempfile.TemporaryDirectory() as temp_dir:
zarr_dir = Path(temp_dir)
logging.info("Temporary directory used for localhost: %s", zarr_dir)
LOG.info("Temporary directory used for localhost: %s", zarr_dir)
else:
logging.info("Provided directory used used for localhost: %s", zarr_dir)
LOG.info("Provided directory used for localhost: %s", zarr_dir)

extra_cols = []
# extra_cols = []
zarr_path = (
zarr_dir / "zarr_bundle.zarr"
) # zarr_dir is the folder, zarr_path is the folder+zarr_name
@@ -324,17 +379,15 @@ def dataframe_to_browser(df: pd.DataFrame, zarr_dir: Path) -> None:
threaded=True,
)

logging.info(
"localhost successfully launched, serving: %s", zarr_dir_with_storename
)
LOG.info("localhost successfully launched, serving: %s", zarr_dir_with_storename)

baseUrl = "https://intracktive.sf.czbiohub.org" # inTRACKtive application
dataUrl = hostURL + "/zarr_bundle.zarr/" # exact path of the data (on localhost)
fullUrl = baseUrl + generate_viewer_state_hash(
data_url=str(dataUrl)
) # full hash that encodes viewerState
logging.info("Copy the following URL into the Google Chrome browser:")
logging.info("full URL: %s", fullUrl)
LOG.info("Copy the following URL into the Google Chrome browser:")
LOG.info("full URL: %s", fullUrl)
webbrowser.open(fullUrl)


@@ -358,10 +411,33 @@ def dataframe_to_browser(df: pd.DataFrame, zarr_dir: Path) -> None:
default=False,
type=bool,
)
@click.option(
"--add_all_attributes",
is_flag=True,
help="Boolean indicating whether to include all extra columns of the CSV as attributes for coloring the cells in the viewer",
default=False,
type=bool,
)
@click.option(
"--add_attribute",
type=str,
default=None,
help="Comma-separated list of column names to include as attributes (e.g., 'cell_size,diameter,type,label')",
)
@click.option(
"--pre_normalized",
is_flag=True,
help="Boolean indicating whether the extra column/columns with attributes are pre-normalized to [0,1]",
default=False,
type=bool,
)
def convert_cli(
csv_file: Path,
out_dir: Path | None,
add_radius: bool,
add_all_attributes: bool,
add_attribute: str | None,
pre_normalized: bool,
) -> None:
"""
Convert a CSV of tracks to a sparse Zarr store
@@ -375,16 +451,33 @@ def convert_cli(

zarr_path = out_dir / f"{csv_file.stem}_bundle.zarr"

extra_cols = ["radius"] if add_radius else []

tracks_df = pd.read_csv(csv_file)

LOG.info(f"Read {len(tracks_df)} points in {time.monotonic() - start} seconds")

extra_cols = []
if add_all_attributes:
columns_standard = REQUIRED_COLUMNS
extra_cols = tracks_df.columns.difference(columns_standard).to_list()
print("extra columns included as attributes:", extra_cols)
elif add_attribute:
selected_columns = [col.strip() for col in add_attribute.split(",")]
missing_columns = [
col for col in selected_columns if col not in tracks_df.columns
]
if missing_columns:
raise ValueError(
f"Columns not found in the CSV file: {', '.join(missing_columns)}"
)
extra_cols = selected_columns
print(f"Selected columns included as attributes: {', '.join(extra_cols)}")

convert_dataframe_to_zarr(
tracks_df,
zarr_path,
add_radius,
extra_cols=extra_cols,
pre_normalized=pre_normalized,
)

LOG.info(f"Full conversion took {time.monotonic() - start} seconds")
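The `--add_attribute` handling above is plain comma-splitting with a guard against names missing from the CSV header. A standalone sketch of that parsing (the header set is illustrative):

```python
# Raw CLI value, deliberately messy to show why .strip() is needed
add_attribute = " cell_size, diameter ,label "
csv_columns = {"track_id", "t", "z", "y", "x", "parent_track_id", "cell_size", "label"}

selected_columns = [col.strip() for col in add_attribute.split(",")]
print(selected_columns)  # → ['cell_size', 'diameter', 'label']

# Names absent from the CSV header; a non-empty list triggers the ValueError in convert_cli
missing_columns = [col for col in selected_columns if col not in csv_columns]
print(missing_columns)  # → ['diameter']
```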