Add concept of block events and some design notes #1059

Open
wants to merge 19 commits into
base: main
92 changes: 92 additions & 0 deletions pydatalab/docs/design/blocks.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
# Data blocks

## Overview

*datalab*'s block system provides a modular approach to data processing and visualization.
Each block type is a specialized component that handles specific kinds of data and operations, making it easy to extend the system's capabilities without modifying the core architecture.
Typically, a given technique (e.g., XRD, NMR) will have its own block.
Blocks can be implemented either in the main package, or as a plugin (see ["Plugin Development"](design/plugins.md)).

Data blocks are modular components that:

1. Process specific file types and data formats for a technique or set of techniques,
2. Generate visualizations and plots from this data to be shown in the UI,
3. Store and manage their own state persistently in a database,
4. Can be attached to individual items or collections in your data management system,
5. Provide a mechanism for handling "events" through a decorator-based registration system,
6. Expose a consistent API for creation, updating, and deletion,
7. Handle logging, errors and warnings in a consistent way to show in the UI.

## Block Lifecycle

1. **Creation**: Blocks are instantiated with an item or collection ID
2. **Initialization**: Initial state is set up, potentially including file data and defaults
3. **Processing**: Data is processed, plots are generated, and state is updated
4. **Serialization**: Block state is serialized for storage or transmission
5. **Update**: Blocks can receive updates from the web interface
6. **Deletion**: Blocks can be removed from items or collections
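
The lifecycle stages above can be sketched with a toy class. This is only an illustration: `ToyBlock` and its `scale` field are hypothetical, and although the method names `to_db`, `from_db` and `update_from_web` mirror those on the real `DataBlock`, the bodies here are simplified stand-ins. Deletion is not shown, as it amounts to removing the stored block from its parent item or collection.

```python
import json
import random
import string


class ToyBlock:
    """Hypothetical stand-in for a datalab block; not the real DataBlock."""

    def __init__(self, item_id=None, collection_id=None, init_data=None):
        # Creation: a block is attached to an item or a collection
        if item_id is None and collection_id is None:
            raise ValueError("A block needs an item_id or a collection_id")
        # Initialization: defaults first, then any initial data on top
        self.data = {"item_id": item_id, "collection_id": collection_id, "scale": 1.0}
        self.data.update(init_data or {})
        self.data.setdefault(
            "block_id", "".join(random.choices(string.ascii_lowercase, k=15))
        )

    def process(self, values):
        # Processing: derive results and store them in the block state
        self.data["result"] = [v * self.data["scale"] for v in values]

    def to_db(self):
        # Serialization: the state must survive a JSON round trip
        return json.loads(json.dumps(self.data))

    @classmethod
    def from_db(cls, entry):
        # Rebuild a block from its stored state
        return cls(
            item_id=entry.get("item_id"),
            collection_id=entry.get("collection_id"),
            init_data=entry,
        )

    def update_from_web(self, data):
        # Update: only the fields present in the request are changed
        self.data.update(data)


block = ToyBlock(item_id="sample-1")
block.update_from_web({"scale": 2.0})
block.process([1, 2, 3])
restored = ToyBlock.from_db(block.to_db())
```

The round trip through `to_db` and `from_db` keeps the block ID and the processed result, which is the property that allows cached data to be reloaded rather than recomputed.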


## Web API

The block system exposes several API endpoints:

- `/add-data-block/`: Create and add a new block to an item
- `/add-collection-data-block/`: Create and add a new block to a collection
- `/update-block/`: Update an existing block's state
- `/delete-block/`: Remove a block from an item
- `/delete-collection-block/`: Remove a block from a collection

## Creating a new block

To create a new block type:

1. Create a class that inherits from `DataBlock`
2. Define the accepted file extensions and block metadata (descriptions will be used to populate the UI documentation automatically)
3. Implement data processing and visualization methods, e.g., with JSON-serialized Bokeh plots stored in the `self.data["bokeh_plot_data"]` entry
4. Store any data to be persisted in the database in the `self.data` attribute
5. Register any event handlers using the `@event` decorator
6. Add the block type to the `BLOCK_TYPES` registry

By default, a generic UI component will be used in the *datalab* interface that
will make use of titles, descriptions, and accepted file extensions to render a
simple user interface for the block.
When the user loads the block in the UI, the block's `plot_functions` methods
will be called in turn, which will either load from scratch, or load cached data
for that block.
If a JSON-serialized Bokeh plot is found in the block's data, this will be
rendered in the UI.
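
As a rough sketch of the steps above, the example below defines a hypothetical `TwoColumnBlock`. To stay self-contained it uses a minimal stand-in for `DataBlock` rather than the real base class. The attribute names `blocktype`, `description` and `accepted_file_extensions`, the `raw_text` key, and the dict-comprehension form of `BLOCK_TYPES` are all assumptions made for illustration. A plain dictionary stands in for the JSON-serialized Bokeh plot.

```python
from typing import Callable, Sequence


class DataBlock:
    """Minimal stand-in for pydatalab.blocks.base.DataBlock (assumed shape)."""

    name: str = "base"
    blocktype: str = "generic"  # assumed attribute name
    description: str = ""  # assumed attribute name
    accepted_file_extensions: Sequence[str] = ()  # assumed attribute name

    def __init__(self, item_id=None, init_data=None):
        self.data = {"item_id": item_id, **(init_data or {})}

    @property
    def plot_functions(self) -> Sequence[Callable]:
        return ()

    def to_web(self) -> dict:
        # Call each plot function in turn, then return the block state
        for plot in self.plot_functions:
            plot()
        return self.data


class TwoColumnBlock(DataBlock):
    """Hypothetical block that plots a two-column text file."""

    name = "Two-column data"
    blocktype = "twocolumn"
    description = "Plots y against x for whitespace-separated text files."
    accepted_file_extensions = (".txt", ".dat")

    @property
    def plot_functions(self):
        return (self.generate_plot,)

    def generate_plot(self):
        # "raw_text" is a hypothetical key holding the file contents
        lines = self.data.get("raw_text", "").splitlines()
        rows = [line.split() for line in lines if line.strip()]
        x = [float(row[0]) for row in rows]
        y = [float(row[1]) for row in rows]
        # A real block would store bokeh.embed.json_item(...) here
        self.data["bokeh_plot_data"] = {"x": x, "y": y}


# Registry mapping block type names to classes (assumed structure)
BLOCK_TYPES = {block.blocktype: block for block in (TwoColumnBlock,)}

block = BLOCK_TYPES["twocolumn"](item_id="sample-1", init_data={"raw_text": "1 2\n3 4\n"})
payload = block.to_web()
```

Looking the class up in the registry by its type name is what lets the server create the right block from a string sent by the UI.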

## Event system

The event system allows external functions to be called by name, enabling clean interaction between the frontend and server-side block functionality.
This is a new feature and this documentation will evolve alongside it.

Currently, the event system allows:

- Registration of event handlers in Python via the `@event` decorator
- Access to available events at both class and instance levels
- Runtime dispatch of events based on name
- Support for event parameters passed as keyword arguments
- Triggering of events from the front-end; for example, a Bokeh-based block can trigger an event from a Bokeh callback using the [`CustomEvent`](https://developer.mozilla.org/en-US/docs/Web/API/CustomEvent/CustomEvent) API:
```javascript
const event = new CustomEvent("bokehStateUpdate", {
  detail: {
    event_name: '<event_name>',
    state_data: '<some data>',
  },
  bubbles: true
});
document.dispatchEvent(event);
```
The base data block (`DataBlockBase.vue`) will listen for such events registered as `'bokehStateUpdate'` and pass them to the appropriate server-side block.
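
On the server side, registration and dispatch by name can be sketched as follows. This is a simplified, self-contained illustration and not the implementation in this pull request: the `Block` and `WavelengthBlock` classes are hypothetical, the decorator omits the optional-call form, and the dispatcher copies each event dictionary and appends to the error list, which are choices made for this sketch.

```python
import functools


def event(func):
    """Mark a method as an event handler (simplified sketch)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    wrapper._is_event = True
    return wrapper


class Block:
    def __init__(self):
        self.data = {}

    @classmethod
    def events_by_name(cls):
        # Walk the MRO from base to subclass so that overrides win
        events = {}
        for klass in reversed(cls.__mro__):
            for name, method in vars(klass).items():
                if getattr(method, "_is_event", False):
                    events[name] = method
        return events

    def process_events(self, events):
        if isinstance(events, dict):
            events = [events]
        handlers = self.events_by_name()
        for ev in events:
            kwargs = dict(ev)  # copy so the caller's dict is not mutated
            name = kwargs.pop("event_name", None)
            if name not in handlers:
                continue
            try:
                # Remaining keys are passed to the handler as keyword arguments
                handlers[name](self, **kwargs)
            except Exception as exc:
                self.data.setdefault("errors", []).append(f"{name}: {exc}")


class WavelengthBlock(Block):
    @event
    def set_wavelength(self, wavelength: float):
        if wavelength <= 0:
            raise ValueError("Wavelength must be a positive number")
        self.data["wavelength"] = wavelength


block = WavelengthBlock()
block.process_events({"event_name": "set_wavelength", "wavelength": 1.54})
block.process_events({"event_name": "set_wavelength", "wavelength": -1})
```

The first event updates the stored wavelength. The second raises inside the handler, and the error is recorded in the block's data so that it can be shown in the UI rather than failing the request.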


## Future Directions

Future updates to the block system will focus on:

- Reducing boilerplate code required for new block types
- Enhanced automatic caching after block creation
- Improving the event system to enable richer UI interactions, e.g., setting user parameters or controlling default plot styles.
- Providing better support for custom user interfaces (i.e., allowing plugins to also specify custom Vue code).
35 changes: 32 additions & 3 deletions pydatalab/src/pydatalab/apps/xrd/blocks.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
import pandas as pd
from scipy.signal import medfilt

from pydatalab.blocks.base import DataBlock
from pydatalab.blocks.base import DataBlock, event
from pydatalab.bokeh_plots import DATALAB_BOKEH_THEME, selectable_axes_plot
from pydatalab.file_utils import get_file_info_by_id
from pydatalab.logger import LOGGER
Expand All @@ -27,9 +27,19 @@ class XRDBlock(DataBlock):
def plot_functions(self):
return (self.generate_xrd_plot,)

@classmethod
@event
def set_wavelength(self, wavelength: float | None):
Can you show how this gets implemented in the XRD block before we merge? That would be a good way to test that everything is working on the front end

if wavelength is None:
wavelength = self.defaults["wavelength"]
elif wavelength <= 0:
raise ValueError("Wavelength must be a positive number")

LOGGER.debug(f"Setting wavelength to {wavelength} for block {self.block_id}")
self.data["wavelength"] = wavelength

@staticmethod
def load_pattern(
self, location: str, wavelength: float | None = None
location: str, wavelength: float | None = None
) -> Tuple[pd.DataFrame, List[str]]:
if not isinstance(location, str):
location = str(location)
Expand Down Expand Up @@ -174,6 +184,25 @@ def generate_xrd_plot(self):
plot_line=True,
plot_points=True,
point_size=3,
parameters={
"wavelength": {
"label": "Wavelength (Å)",
"value": self.data["wavelength"],
"event": (
"""
console.log("dispatching event");
const event = new CustomEvent('bokehStateUpdate', {
detail: {
event_name: 'set_wavelength',
wavelength: event.target.value_throttled,
},
bubbles: true
});
document.dispatchEvent(event);
"""
),
}
},
)

self.data["bokeh_plot_data"] = bokeh.embed.json_item(p, theme=DATALAB_BOKEH_THEME)
123 changes: 110 additions & 13 deletions pydatalab/src/pydatalab/blocks/base.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
import functools
import random
import warnings
from typing import Any, Callable, Dict, Optional, Sequence
Expand All @@ -9,6 +10,33 @@
__all__ = ("generate_random_id", "DataBlock")


def event(func: Callable | None = None) -> Callable:
    """Decorator to register an event with a block."""

    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            return f(*args, **kwargs)

        wrapper._is_event = True
        return wrapper

    if func:
        return decorator(func)

    return decorator


class classproperty:
    """Decorator that creates a class-level property."""

    def __init__(self, method=None):
        self.method = method

    def __get__(self, instance, cls=None):
        return self.method(cls)


def generate_random_id():
"""This function generates a random 15-length string for use as an id for a datablock. It
should be sufficiently random that there is a negligible risk of ever generating
Expand All @@ -33,7 +61,7 @@ def generate_random_id():
class DataBlock:
"""Base class for a data block."""

name: str
name: str = "base"
"""The human-readable block name specifying which technique
or file format it pertains to.
"""
Expand Down Expand Up @@ -63,8 +91,8 @@ def __init__(
self,
item_id: Optional[str] = None,
collection_id: Optional[str] = None,
init_data=None,
unique_id=None,
init_data: dict | None = None,
unique_id: str | None = None,
):
"""Create a data block object for the given `item_id` or `collection_id`.

Expand All @@ -73,6 +101,7 @@ def __init__(
collection_id: The collection to which the block is attached.
init_data: A dictionary of data to initialise the block with.
unique_id: A unique id for the block, used in the DOM and database.

"""
if init_data is None:
init_data = {}
Expand Down Expand Up @@ -137,13 +166,13 @@ def to_db(self):
return self.data

@classmethod
def from_db(cls, db_entry):
"""create a block from json (dictionary) stored in a db"""
def from_db(cls, block: dict):
"""Create a block from data stored in the database."""
LOGGER.debug("Loading block %s from database object.", cls.__class__.__name__)
new_block = cls(
item_id=db_entry.get("item_id"),
collection_id=db_entry.get("collection_id"),
dictionary=db_entry,
item_id=block.get("item_id"),
collection_id=block.get("collection_id"),
init_data=block,
)
if "file_id" in new_block.data:
new_block.data["file_id"] = str(new_block.data["file_id"])
Expand All @@ -153,7 +182,7 @@ def from_db(cls, db_entry):

return new_block

def to_web(self) -> Dict[str, Any]:
def to_web(self) -> dict[str, Any]:
"""Returns a JSON serializable dictionary to render the data block on the web."""
block_errors = []
block_warnings = []
Expand Down Expand Up @@ -188,8 +217,70 @@ def to_web(self) -> Dict[str, Any]:

return self.data

    def process_events(self, events: list[dict] | dict):
        """Handle any supported events passed to the block."""
        if isinstance(events, dict):
            events = [events]

        for event in events:
            # Match the event to any registered by the block
            if (event_name := event.pop("event_name")) in self.event_names:
                # Bind the method to the instance before calling
                bound_method = self.__class__.events_by_name[event_name].__get__(
                    self, self.__class__
                )
                try:
                    bound_method(**event)
                except Exception as e:
                    LOGGER.error(
                        f"Error processing event {event_name} for block {self.__class__.__name__}: {e}"
                    )
                    self.data["errors"] = [
                        f"{self.__class__.__name__}: Error processing event {event}: {e}"
                    ]

    @event()
    def null_event(self, **kwargs):
        """A null debug event that does nothing but logs its kwargs and overwrites the data dict with the args."""
        LOGGER.debug(
            "Null event received by block %s with kwargs: %s", self.__class__.__name__, kwargs
        )
        self.data["kwargs"] = kwargs["kwargs"]

    @classmethod
    def _get_events(cls) -> dict[str, Callable]:
        events = {}
        # Loop over parent classes to find events
        for c in cls.__mro__:
            for name, method in c.__dict__.items():
                if hasattr(method, "_is_event"):
                    events[name] = method

        return events

    @classproperty
    def event_names(cls) -> set[str]:
        """Return the set of event names supported by this block."""
        return set(cls.events_by_name.keys())

    @classproperty
    def events_by_name(cls) -> dict[str, Callable]:
        """Returns a dict of registered events for this block."""
        return {
            name: method
            for name, method in cls._get_events().items()
            if getattr(method, "_is_event", False)
        }

@classmethod
def from_web(cls, data):
def from_web(cls, data: dict):
"""Initialise the block state from data passed via web request
with a given item, collection and block ID.

Parameters:
data: The block data to initialise the block with.

"""
LOGGER.debug("Loading block %s from web request.", cls.__class__.__name__)
block = cls(
item_id=data.get("item_id"),
Expand All @@ -199,9 +290,15 @@ def from_web(cls, data):
block.update_from_web(data)
return block

def update_from_web(self, data):
"""update the object with data received from the website. Only updates fields
that are specified in the dictionary- other fields are left alone"""
def update_from_web(self, data: dict):
"""Update the block with data received from a web request.

Only updates fields that are specified in the dictionary; other fields are left alone.

Parameters:
data: A dictionary of data to update the block with.

"""
LOGGER.debug(
"Updating block %s from web request",
self.__class__.__name__,
17 changes: 17 additions & 0 deletions pydatalab/src/pydatalab/bokeh_plots.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@
TableColumn,
)
from bokeh.models.widgets import Select
from bokeh.models.widgets.inputs import NumericInput
from bokeh.palettes import Accent, Dark2
from bokeh.plotting import ColumnDataSource, figure
from bokeh.themes import Theme
Expand Down Expand Up @@ -153,6 +154,7 @@ def selectable_axes_plot(
plot_index: Optional[int] = None,
tools: Optional[List] = None,
show_table: bool = False,
parameters: Optional[Dict] = None,
**kwargs,
):
"""
Expand Down Expand Up @@ -363,6 +365,21 @@ def selectable_axes_plot(
if len(df) <= 1:
p.legend.visible = False

    input_widgets = []
    if parameters:
        for parameter in parameters.values():
            input_widget = NumericInput(
                title=parameter["label"], value=parameter["value"], low=0.01
            )
            if parameter["event"]:
                # CustomJS takes the JavaScript source via `code`; `args` is for named objects
                input_widget.js_on_change(
                    "value_throttled", CustomJS(code=parameter["event"])
                )
            input_widgets.append(input_widget)

    if input_widgets:
        plot_columns.extend(input_widgets)

if not skip_plot:
plot_columns.append(p)
if len(x_options) > 1: