Workflow description format
HyperFlow uses a simple JSON format to describe workflows. The file has the following structure:
{
"name": "Hello", // name of the workflow
"processes": [ ... ], // array of vertices of the workflow graph (called "processes")
"signals": [ ... ] // array of edges of the workflow graph (called "signals")
}
For example, this structure:
{
"name": "Hello",
"processes": [ {
"name": "Node_0", // name of the "process" (should be unique)
"ins": [ 0 ], // input edges ("signals") (array of indexes in the "signals" array)
"outs": [ 1 ] // output edges
}, {
"name": "Node_1",
"ins": [ 1 ],
"outs": [ 2 ]
} ],
"signals": [ {
"name": "sig_0" // name of the signal (should be unique)
}, {
"name": "sig_1"
}, {
"name": "sig_2"
} ]
}
describes the graph sig_0 → Node_0 → sig_1 → Node_1 → sig_2.
Note that in HyperFlow the workflow graph is a multigraph which means that a given pair of vertices may be connected by multiple edges. For example, each edge may denote a file that is produced by one task and consumed by another.
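Since ins and outs hold positions in the signals array, the edges of the graph can be listed by resolving those indices to signal names. A short JavaScript sketch using the Hello workflow above (the helper below is illustrative, not part of HyperFlow):

```javascript
// The Hello workflow from the example above, as a plain object.
const wf = {
  name: "Hello",
  processes: [
    { name: "Node_0", ins: [0], outs: [1] },
    { name: "Node_1", ins: [1], outs: [2] }
  ],
  signals: [{ name: "sig_0" }, { name: "sig_1" }, { name: "sig_2" }]
};

// Resolve a signal index to its name.
const sigName = i => wf.signals[i].name;

// Print which signals each process consumes and produces.
for (const p of wf.processes) {
  console.log(`${p.name}: consumes [${p.ins.map(sigName)}], produces [${p.outs.map(sigName)}]`);
}
// Node_0: consumes [sig_0], produces [sig_1]
// Node_1: consumes [sig_1], produces [sig_2]
```

Note that sig_1 appears in the outs of Node_0 and the ins of Node_1: this shared index is what forms the edge between the two processes.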
A process is described as a JSON object, for example:
{
"name": "Sqr",
"type": "dataflow",
"function": "sqr",
"ins": [ "number" ], // instead of array indexes, signal names can also be used
"outs": [ "square" ]
}
- name (string, mandatory) - unique name of the process
- type (string, optional) - type of the process (default value: dataflow)
- function (string, mandatory) - name of the JavaScript function that will be invoked when the process is activated
- parlevel (integer, optional) - how many activations of the process can be executed concurrently (default 1, 0 means unlimited). See the Sqrsum example.
- ordering (string "true" or "false", optional) - a flag denoting whether outputs of concurrent activations of a process should be ordered. See the Sqrsum example.
- firingLimit (integer, optional) - the maximum number of activations of this process (unbounded if undefined)
- firingInterval (integer, optional) - time interval (in milliseconds) at which the process should be activated (only relevant for processes with no input signals; see the Streaming Map/Reduce example)
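The function attribute names a JavaScript function that implements the process. As a rough illustration, a sqr function for the process above might look like the sketch below; the (ins, outs, executor, config, cb) signature and the ins[i].data convention shown here are assumptions for the sake of the example, so check the HyperFlow sources for the exact function contract:

```javascript
// Hypothetical sketch of a "sqr" process function: read a data element
// from the first input signal, emit its square on the first output signal,
// then invoke the callback with the populated outs array.
function sqr(ins, outs, executor, config, cb) {
  const n = ins[0].data[0];   // first data element of the "number" signal
  outs[0].data = [n * n];     // emit on the "square" signal
  cb(null, outs);             // signal completion to the engine
}

// Standalone invocation, for illustration only:
sqr([{ data: [7] }], [{}], null, null, (err, outs) => {
  console.log(outs[0].data[0]); // 49
});
```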
A signal is described as a JSON object, for example:
{
"name": "number",
"data": [ 1, 2, 3, 4, 5, 6 ]
}
If data is present, it contains a sequence of data elements (instances of that signal) that will be sent to the workflow after it has started.
HyperFlow workflow description supports variable interpolation. You can put {{var_name}} variables in the workflow.json, and provide values for these variables in one of the following ways:
- Through the hflow command line parameter --var, e.g.
hflow run <wf_dir> --var="function=command_print" --var="workdir=/home/workdir"
- Through environment variables starting with HF_VAR_, e.g.
export HF_VAR_function=command_print
This will result in replacing all occurrences of {{function}} in workflow.json with command_print when the workflow is run with the hflow run command.
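The effect of interpolation is a plain textual substitution over the workflow file. A minimal JavaScript sketch of this behavior (illustrative only; the actual substitution is performed internally by hflow):

```javascript
// Replace every {{var_name}} occurrence with its value from vars;
// unknown variables are left untouched in this sketch.
function interpolate(text, vars) {
  return text.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    name in vars ? vars[name] : match);
}

const wfText = '{ "function": "{{function}}", "workdir": "{{workdir}}" }';
console.log(interpolate(wfText, {
  function: "command_print",
  workdir: "/home/workdir"
}));
// { "function": "command_print", "workdir": "/home/workdir" }
```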