Merlin uses a directed acyclic graph (DAG) to represent operations on data, such as filtering or bucketing, and to represent operations in a recommender system, such as creating an ensemble or filtering candidate items during inference.

Understanding the Merlin DAG is helpful if you want to develop your own operator (Op) or build a recommender system with Merlin.

## DAG Terminology

node
: A node in the DAG is a group of columns and at least one _operator_.
The columns are specified with a _column selector_.

column selector
: A column selector specifies the columns to select from a dataset using column names or _tags_.

operator
: An operator performs a transformation on data and returns a new _node_.
The data is identified by the _column selector_.
Some simple operators like `+` and `-` add or remove columns.
More complex operations are applied by shifting the operators onto the column selector with the `>>` notation.

schema
: A Merlin schema is metadata that describes the columns in a dataset.
Each column has its own schema that identifies the column name and can specify _tags_ and properties.

tag
: A Merlin tag categorizes information about a column.
Adding a tag to a column enables you to select columns for operations by tag rather than name.

For example, you can add the `USER_ID` and `ITEM_ID` tags to columns.
Modeling and inference operations can use that information to act accordingly on the dataset.
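
To make the terminology concrete, the following is a minimal, self-contained sketch of how the terms above fit together. It does not use the Merlin library: the `Selector` and `Node` classes, the dictionary-based schema, and the `transform` method are hypothetical stand-ins written for illustration, and the real Merlin classes may differ in names and behavior. The sketch only mirrors the ideas described above: selecting columns by name or tag, adding and removing columns with `+` and `-`, and shifting operators onto a selector with `>>`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

# Toy schema: column name -> set of tags (a stand-in, not Merlin's schema class).
Schema = Dict[str, Set[str]]
Op = Callable[[float], float]


@dataclass(frozen=True)
class Selector:
    """Toy column selector: picks columns by name or by tag."""
    names: Tuple[str, ...] = ()
    tags: Tuple[str, ...] = ()
    excluded: Tuple[str, ...] = ()

    def resolve(self, schema: Schema) -> List[str]:
        # Keep schema order; a column matches by name or by any requested tag.
        return [
            col for col, col_tags in schema.items()
            if col not in self.excluded
            and (col in self.names or col_tags & set(self.tags))
        ]

    def __add__(self, other: "Selector") -> "Selector":
        # `+` adds the columns selected by another selector.
        return Selector(self.names + other.names,
                        self.tags + other.tags,
                        self.excluded + other.excluded)

    def __sub__(self, names: List[str]) -> "Selector":
        # `-` removes columns by name.
        return Selector(self.names, self.tags, self.excluded + tuple(names))

    def __rshift__(self, op: Op) -> "Node":
        # `selector >> op` produces a node: columns plus one operator.
        return Node(self, (op,))


@dataclass(frozen=True)
class Node:
    """Toy node: a group of columns and at least one operator."""
    selector: Selector
    ops: Tuple[Op, ...] = ()

    def __rshift__(self, op: Op) -> "Node":
        # Shifting another operator onto a node returns a new node.
        return Node(self.selector, self.ops + (op,))

    def transform(self, schema: Schema, row: Dict[str, float]) -> Dict[str, float]:
        # Apply every operator, in order, to each selected column.
        out = {}
        for col in self.selector.resolve(schema):
            value = row[col]
            for op in self.ops:
                value = op(value)
            out[col] = value
        return out


schema = {
    "user_id": {"USER_ID"},
    "item_id": {"ITEM_ID"},
    "price": set(),
    "rating": set(),
}
row = {"user_id": 7, "item_id": 42, "price": 10.0, "rating": 4.0}

ids = Selector(tags=("USER_ID", "ITEM_ID"))      # select by tag
numeric = Selector(names=("price", "rating"))    # select by name

# Remove a column, then shift two operators onto the remaining selection.
scaled = (numeric - ["rating"]) >> (lambda v: v * 2) >> (lambda v: v + 1)

result = scaled.transform(schema, row)
id_columns = ids.resolve(schema)
combined = (ids + numeric).resolve(schema)
```

In this sketch, selecting by tag means the `ids` selector keeps working even if the underlying column names change, as long as the schema still carries the `USER_ID` and `ITEM_ID` tags.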